!!! Development will be continued over here. !!!
Developed in association with Uni-Graz.
Idea: Lisa Eckerstorfer MSc.
Supervisor: Univ.-Prof. Dr.phil. Dipl.-Psych. Katja Corcoran
Software-Design: Philipp Feldner (Computer Science Student TU Graz)
Twitter/Telegram: @PhilippFeldner
Email: feldnerphilipp@gmail.com
Telegram API: https://github.com/python-telegram-bot/python-telegram-bot
This is a chat-bot for the messaging app Telegram. It is intended to conduct surveys/studies that stretch over longer time periods. Since there was no similar tool around to conduct this type of study, we developed one on our own. The messaging app Telegram seemed fitting because of its decent API and popularity. The bot is written in Python 3 and uses this API.
Here is the official telegram info about bots and how you can set up your own.
- Multi language support
- Studies are created in the easy-to-use JSON format
- Variable question-block scheduling
- Custom keyboards
- Condition based questions
- Timezone support
- Backup database
- Data is stored in easy-to-use CSV files
To allow quick study creation I designed a format for question input that is pretty easy to use and is based on the JSON notation. JSON is a data-interchange format that, like XML, is easy for humans to read and easy for machines to parse. The whole format is basically built as follows: We have multiple blocks that represent days. Within those days we have smaller chunks that represent question-blocks that shall be scheduled throughout that day. Within those question-blocks are the questions with a bunch of meta information that is explained later.
Currently there is support for English, German, French and Spanish, but with a little Python knowledge you will be able to add/remove other languages. For every language create a file with the name format question_set_en.json (en/de/es/fr) and place those files in the survey folder.
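The naming scheme above can be sketched as a small loader. This is only an illustration, not the bot's actual code: the function names are hypothetical, and falling back to a default language when a file is missing is my assumption, loosely based on the DEFAULT_LANGUAGE setting described further down.

```python
import json
from pathlib import Path

# Assumption: mirrors the DEFAULT_LANGUAGE setting; the value is an example.
DEFAULT_LANGUAGE = "en"


def question_set_path(folder, language):
    """Build the expected file path, e.g. survey/question_set_en.json."""
    return Path(folder) / f"question_set_{language}.json"


def load_question_set(folder, language):
    """Load the question set for a language, falling back to the default one."""
    path = question_set_path(folder, language)
    if not path.exists():
        # Hypothetical fallback behaviour, not confirmed by the project.
        path = question_set_path(folder, DEFAULT_LANGUAGE)
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `load_question_set("survey", "de")` would then read survey/question_set_de.json if it exists and the English file otherwise.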
Span a pair of [ ] from the very beginning of the file to the very end! Within those brackets place all the day elements, each enclosed in { }, and separate them with a comma.
- day: Represents the day of the study. Make sure to put the day elements in ascending order.
- blocks: Contains a list of block elements.
Example day:
{
"day": 1,
"blocks":
[
{
-- Fill in block element --
},
{
-- Fill in block element --
}
]
}
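Before starting a study it can help to check a question-set file against the rules above. The following is a minimal sketch of such a check; `validate_days` is a hypothetical helper and not part of the bot, and it only covers the top-level list, the ascending `day` values and the presence of a `blocks` list.

```python
import json


def validate_days(days):
    """Return a list of problems found in a parsed question set (empty if none)."""
    if not isinstance(days, list):
        return ["top level must be a list of day elements"]
    problems = []
    previous_day = None
    for index, element in enumerate(days):
        if not isinstance(element, dict):
            problems.append(f"element {index} is not an object")
            continue
        day = element.get("day")
        if not isinstance(day, int):
            problems.append(f"element {index} has no integer 'day'")
        elif previous_day is not None and day <= previous_day:
            problems.append(f"day {day} is not in ascending order")
        else:
            previous_day = day
        if not isinstance(element.get("blocks"), list):
            problems.append(f"element {index} has no 'blocks' list")
    return problems


sample = json.loads('[{"day": 1, "blocks": []}, {"day": 2, "blocks": []}]')
print(validate_days(sample))
```

An empty list means the structure looks fine; every entry otherwise names one problem.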
- time: Currently this element expects a keyword that is defined in SCHEDULE_INTERVALS in the admin/settings.py file. There you can define your own time intervals in the format ["hh:mm", "hh:mm"]. Make sure each interval is at least 30 minutes long. An option to schedule at a certain time may be implemented in the future.
- settings: Here certain properties for a block can be set.
List of current options:
- MANDATORY: Marks the block as mandatory. The next block is only scheduled once the current block is complete or a question contains the command Q_ON.
- questions: contains all the question elements of the block.
Example block:
{
"time": "SAMPLE_TIME_KEYWORD",
"settings": [["MANDATORY"]],
"questions":
[
{
-- Fill in question element --
},
{
-- Fill in question element --
}
]
}
- text: contains the message text that is sent to the user
- choice: contains nested lists of answer possibilities which are used to create a custom keyboard within Telegram. Dynamic keyboards can be created within the python file survey/keyboard_presets.py and have to be registered in the CUSTOM_KEYBOARD dictionary.
- condition: Conditions can be used to give questions certain requirements. If a user does not fulfill given requirements the question will be skipped for him. Multiple conditions are possible.
- condition_required: Previously defined conditions can be put here and if the user does not fulfill them the question will be skipped for him.
- commands: Commands are basically signals for the program to trigger special events. List of all (current) commands and their usage:
- FORCE_KB_REPLY: The user has to choose an option from the Keyboard to proceed.
- Q_ON: See BLOCK settings - MANDATORY
- COUNTRY: Signals that the user will respond with his country: Relevant for database.
- AGE: Signals that the user will respond with his age: Relevant for database.
- GENDER: Signals that the user will respond with his gender: Relevant for database.
- TIMEZONE: Signals that the user will respond with his timezone: Relevant for database.
- ["DATA", "DATA_NAME", "COMMAND"]: Signals that a certain operation shall be executed onto a custom datastructure. (See Survey Specific Functions).
Example question:
{
"text": "Sample Text",
"choice": [
["Sample Choice 1"],
["Sample Choice 2"]
],
"condition_required": ["#IDENTIFIER"],
"condition": [["Sample Choice 1", "#IDENTIFIER"]],
"commands": [["FORCE_KB_REPLY"],["COUNTRY"]]
}
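To illustrate how `condition` and `condition_required` could work together, here is a small sketch. It rests on my reading of the example question, which is an assumption: a `condition` entry `[answer, identifier]` assigns the identifier to the user when that answer is given, and `condition_required` lists identifiers the user must hold for the question to be asked. The helper names and the sample questions are made up.

```python
def update_conditions(question, answer, user_conditions):
    """Add every identifier whose trigger answer matches the user's reply."""
    for trigger, identifier in question.get("condition", []):
        if answer == trigger:
            user_conditions.add(identifier)


def should_ask(question, user_conditions):
    """Ask a question only if the user holds all required identifiers."""
    required = question.get("condition_required", [])
    return all(identifier in user_conditions for identifier in required)


# Hypothetical questions, not taken from a real study.
first = {
    "text": "Do you smoke?",
    "choice": [["Yes"], ["No"]],
    "condition": [["Yes", "#SMOKER"]],
}
follow_up = {
    "text": "How many cigarettes did you smoke today?",
    "condition_required": ["#SMOKER"],
}

conditions = set()
update_conditions(first, "Yes", conditions)
print(should_ask(follow_up, conditions))
```

A user who answered "No" would never receive the identifier, so the follow-up question would be skipped.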
Within the admin package you find a file settings.py. This file is responsible for most of the settings you can configure. Custom keyboards are defined in survey/keyboard_presets.py.
- ADMINS: A list with all admin chat_ids. Admin features are in development.
- DEBUG: Debug mode on/off (bool). For debugging purposes. If activated it also stores the debug info to a log.txt file.
- DELETE: Activates (bool) the /delete_me command that allows the user to withdraw himself from the study and deletes all his records, including csv and database entries.
- QUICK_TEST: For testing purposes. If set to a different value than False it reduces the time of the scheduling blocks to n seconds.
- DEFAULT_LANGUAGE: Default language if something goes wrong. It is important that this language exists. Otherwise the program might crash.
- DEFAULT_TIMEZONE: Default timezone if something goes wrong. It is important that this timezone is a timezone defined in pytz. Otherwise the program might crash.
- SCHEDULE_INTERVALS: A Python dictionary that maps the keywords from the time value (question-blocks) to an interval in the format ["hh:mm","hh:mm"] with a minimum length of 30 minutes.
- INFO_TEXT: A python dictionary that maps the info text for the /info to a language abbreviation.
- STOP_TEXT: A python dictionary that maps the stop text for the /stop to a language abbreviation.
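As an illustration of the SCHEDULE_INTERVALS setting, the sketch below defines two intervals and checks the 30-minute minimum. The keywords and times are example values I made up, and the helper functions are hypothetical; the sketch also assumes that both times lie on the same day.

```python
from datetime import datetime

# Hypothetical example values; the keywords are free to choose.
SCHEDULE_INTERVALS = {
    "MORNING": ["08:00", "10:00"],
    "EVENING": ["18:30", "19:00"],
}


def interval_minutes(interval):
    """Length of an ["hh:mm", "hh:mm"] interval in minutes (same day assumed)."""
    start, end = (datetime.strptime(value, "%H:%M") for value in interval)
    return int((end - start).total_seconds() // 60)


def too_short(intervals, minimum=30):
    """Return the keywords whose interval is shorter than `minimum` minutes."""
    return [
        keyword
        for keyword, interval in intervals.items()
        if interval_minutes(interval) < minimum
    ]


print(too_short(SCHEDULE_INTERVALS))
```

An interval whose end lies before its start yields a negative length and is therefore reported as too short as well.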
The survey/keyboard_presets.py file is meant to contain custom keyboards that can be generated dynamically.
Example: timezone keyboards
The user enters his current location (country) and this value is stored in the DB. Afterwards the timezone keyboard gets generated from a dictionary that maps every country to its (possibly multiple) timezones. When creating dynamic keyboards, register them in the CUSTOM_KEYBOARD dictionary. I advise you to use the prefix KB_ for easier recognition.
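The timezone example could look roughly like this. It is a sketch under several assumptions: the country mapping is a tiny made-up excerpt, CUSTOM_KEYBOARD is assumed to map a KB_ name to a generator function, and referencing a dynamic keyboard by putting its name into the `choice` field is my guess at how the lookup might be wired up.

```python
# Hypothetical excerpt; a real mapping would cover every country.
COUNTRY_TIMEZONES = {
    "Austria": ["Europe/Vienna"],
    "Australia": ["Australia/Sydney", "Australia/Perth"],
}


def timezone_keyboard(country):
    """Build a nested list (one button per row) of timezones for a country."""
    return [[zone] for zone in COUNTRY_TIMEZONES.get(country, [])]


# Assumed shape of the registry: keyboard name -> generator function.
CUSTOM_KEYBOARD = {"KB_TIMEZONE": timezone_keyboard}


def resolve_choice(choice, country):
    """Return a static choice list unchanged, or generate a registered keyboard."""
    if isinstance(choice, str) and choice in CUSTOM_KEYBOARD:
        return CUSTOM_KEYBOARD[choice](country)
    return choice


print(resolve_choice("KB_TIMEZONE", "Australia"))
```

The nested-list result has the same shape as a static `choice` value, so both kinds of keyboards can be handled the same way afterwards.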
If your survey requires custom replies you can generate them via your own python functions.
- Step 1: Gathering Data: Likely you want to make your questions dependent on data from the users. Therefore every participant has their own dictionary of data structures. Using the commands field within a question allows you to create/delete a data structure. Use the format ["DATA", "DATA_NAME", "COMMAND"]. "DATA" is the key to recognize your intention. A unique "DATA_NAME" shall be chosen to identify the data structure. "COMMAND" shall be replaced by either "ADD" or "CLR" (clear) to add the user response to the data structure or delete the data structure entirely.
- Step 2: Defining your own functions: Custom functions shall be defined within admin/survey_specific.py. Every function needs a unique string as identifier so that it can be invoked from the json file. An example of how a function shall be registered:
It makes sense to pass the parameters data and user to access user specific data. The return value needs to be stringified (str()) since it is going to be part of a message.
# Register your own functions here!
# Define them above.
def survey_function(user, data, function):
    if function == "baseline":
        return baseline_(data, user)
    elif function == "another_function":
        return another_function_(data, user)
- Step 3: Invoking a function within the json files: <<DATA|data_name|function_name>> Placing this within a question message will invoke your function and replace it with your return value from this specific function.
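The three steps above can be sketched end to end as follows. This is a simplified illustration, not the bot's implementation: `apply_data_command`, `render` and the body of `baseline_` are hypothetical, each data structure is assumed to be a plain list of responses, and `survey_function` is assumed to receive only the named data structure.

```python
import re

# Matches <<DATA|data_name|function_name>> inside a question text.
PLACEHOLDER = re.compile(r"<<DATA\|([^|>]+)\|([^|>]+)>>")


def apply_data_command(user_data, command, response):
    """Handle a ["DATA", name, "ADD" | "CLR"] command for one participant."""
    _, name, action = command
    if action == "ADD":
        user_data.setdefault(name, []).append(response)
    elif action == "CLR":
        user_data.pop(name, None)


def baseline_(data, user):
    """Hypothetical custom function: mean of the numeric responses so far."""
    values = [float(value) for value in data]
    return str(sum(values) / len(values)) if values else "n/a"


def survey_function(user, data, function):
    """Dispatch a function identifier to the matching custom function."""
    if function == "baseline":
        return baseline_(data, user)
    return ""


def render(text, user, user_data):
    """Replace every placeholder with the return value of its function."""
    def replace(match):
        name, function = match.group(1), match.group(2)
        return survey_function(user, user_data.get(name, []), function)
    return PLACEHOLDER.sub(replace, text)


user_data = {}
apply_data_command(user_data, ["DATA", "steps", "ADD"], "4")
apply_data_command(user_data, ["DATA", "steps", "ADD"], "6")
message = render("Your average so far: <<DATA|steps|baseline>>", "user-1", user_data)
print(message)
```

Unknown function names are replaced by an empty string here; raising an error instead would make typos in the json file easier to spot.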
Emojis can simply be added to your text as Unicode symbols.
To avoid data loss on restart I run a small sqlite3 database in the background, that stores the most basic attributes of a user.
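A backup of this kind could be sketched with the sqlite3 module from the standard library. The table layout and the stored attributes (language, timezone, current day) are assumptions for illustration; the actual schema of the bot may differ.

```python
import sqlite3


def open_database(path=":memory:"):
    """Open the backup database and create the (assumed) users table."""
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "chat_id INTEGER PRIMARY KEY, language TEXT, timezone TEXT, day INTEGER)"
    )
    return connection


def save_user(connection, chat_id, language, timezone, day):
    """Insert a user or overwrite the stored state for an existing chat_id."""
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?, ?, ?)",
            (chat_id, language, timezone, day),
        )


def load_users(connection):
    """Return all stored users, e.g. to restore the state after a restart."""
    cursor = connection.execute(
        "SELECT chat_id, language, timezone, day FROM users ORDER BY chat_id"
    )
    return cursor.fetchall()
```

Passing a file path instead of ":memory:" makes the data survive a restart, which is the purpose described above.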
CSV (comma separated values) is a spreadsheet format that is suited very well for storing survey data. Every user has his own sheet that is named after his chat_id and stored in survey/data_incomplete. As soon as a user has completed the survey the file gets copied to survey/data_complete.
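The file handling described above might be sketched like this. The two helper functions and the column layout of a row are hypothetical; only the folder names and the naming of files after the chat_id come from the description.

```python
import csv
import shutil
from pathlib import Path


def record_answer(base_dir, chat_id, row):
    """Append one row to <base_dir>/data_incomplete/<chat_id>.csv."""
    incomplete = Path(base_dir) / "data_incomplete"
    incomplete.mkdir(parents=True, exist_ok=True)
    path = incomplete / f"{chat_id}.csv"
    with path.open("a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(row)
    return path


def mark_complete(base_dir, chat_id):
    """Copy a finished user's sheet to <base_dir>/data_complete."""
    source = Path(base_dir) / "data_incomplete" / f"{chat_id}.csv"
    target_dir = Path(base_dir) / "data_complete"
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(source, target_dir / source.name))
```

With `base_dir` set to the survey folder, `record_answer("survey", 42, ["q1", "Yes"])` would write to survey/data_incomplete/42.csv, and `mark_complete("survey", 42)` would copy that file once the user has finished.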
This program was developed with the PyCharm IDE by JetBrains and tested on a local machine. The Telegram API is HTTPS-based, so you might need to check that the necessary ports are open.