
Korpannoteringslabbet (Corpus Annotation Laboratory)

This project is split into three parts: the WSGI backend, the web frontend, and the catapult. The catapult runs a Python instance that shares the loaded lexicon, keeps malt processes running, and reduces the Python interpreter startup time.

This project depends on the Corpus Pipeline's Python scripts and Makefiles, available from Språkbanken.


The backend is hosted at demo in /export/htdocs_sb/annoteringslabb; its subdirectory pipeline hosts running and completed builds. It is available at

If the backend is run directly with the Python interpreter, it tries to use the eventlet WSGI implementation if it is installed. Otherwise it falls back on the reference implementation. Eventlet is preferred because it handles concurrent requests.
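The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the backend's actual code: it assumes the reference implementation is the standard library's wsgiref, and the application function, the host, and the port 8051 are hypothetical placeholders.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Placeholder WSGI app standing in for the real backend (hypothetical)."""
    body = b"ok\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def choose_server():
    """Return 'eventlet' if the package can be imported, else 'wsgiref'."""
    try:
        import eventlet.wsgi  # noqa: F401
    except ImportError:
        return "wsgiref"
    return "eventlet"


def serve(app, host="127.0.0.1", port=8051):
    """Serve app forever, preferring eventlet. Host and port are assumptions."""
    if choose_server() == "eventlet":
        import eventlet
        import eventlet.wsgi
        # eventlet serves each request in a green thread, so requests overlap.
        eventlet.wsgi.server(eventlet.listen((host, port)), app)
    else:
        # wsgiref handles one request at a time.
        make_server(host, port, app).serve_forever()
```

Calling serve(application) starts whichever server is available and blocks until it is interrupted.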


The index.wsgi has two variables at the top that need to be configured:

  • paths: the Python paths to the sb python directory and to the backend directory, and

  • log_file_location: the location of the log. It can be omitted to log to stdout.
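As a rough sketch of how these two variables might be used, assuming they are a list of directories and an optional file name: the directory values, the configure helper, and the logger name below are hypothetical and not taken from index.wsgi.

```python
import logging
import sys

# Hypothetical values: replace with the real directories on your host.
paths = ["/path/to/sb/python", "/path/to/backend"]
log_file_location = None  # None means: log to stdout


def configure(paths, log_file_location=None):
    """Extend sys.path and return a logger writing to the file or to stdout."""
    for path in paths:
        if path not in sys.path:
            sys.path.append(path)
    if log_file_location:
        handler = logging.FileHandler(log_file_location)
    else:
        handler = logging.StreamHandler(sys.stdout)
    logger = logging.getLogger("annoteringslabb")  # hypothetical logger name
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    return logger


logger = configure(paths, log_file_location)
```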

The rest of the configuration is in. The most important settings are the pipeline's directory and the location of the settings schema.


The catapult is running on demo, in the /home/dan/annotate/webservice/catapult/ directory.

Scripts are run on the catapult with the tiny C program catalaunch, which is built by running make in the catapult directory.


Directories are hard-coded into the keep-alive script, the start-server script, and the saldo update script. Go nuts!


The frontend is hosted at k2 and available at


The config.js file configures the backend's address and the address of Karp.

Settings JSON Schema

The backend creates the makefile from a JSON object that must satisfy the JSON schema stored in the backend. The frontend builds its form from the same schema. New entries can be added to the schema, and hopefully the frontend will render them somewhat ok. The file that creates the makefile is backend/
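To illustrate the idea of validating a settings object against a schema and then emitting makefile variables, here is a minimal standard-library sketch. The schema, its property names (corpus, sentence_chunk), the validate helper, and the output format are all hypothetical; the real schema and makefile generator live in the backend and are more elaborate.

```python
import json

# Hypothetical schema: the property names are illustrative only.
SCHEMA = {
    "type": "object",
    "properties": {
        "corpus": {"type": "string"},
        "sentence_chunk": {"type": "string",
                           "enum": ["paragraph", "linebreaks"]},
    },
    "required": ["corpus"],
}

JSON_TYPES = {"string": str, "object": dict, "boolean": bool}


def validate(settings, schema):
    """Return a list of error messages; an empty list means valid."""
    if not isinstance(settings, dict):
        return ["settings must be a JSON object"]
    errors = []
    for key in schema.get("required", []):
        if key not in settings:
            errors.append("missing required setting: " + key)
    for key, value in settings.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append("unknown setting: " + key)
        elif not isinstance(value, JSON_TYPES[spec["type"]]):
            errors.append(key + " must be of type " + spec["type"])
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(key + " must be one of " + ", ".join(spec["enum"]))
    return errors


def make_makefile(settings):
    """Render validated settings as 'name = value' lines (assumed format)."""
    return "".join("%s = %s\n" % (k, settings[k]) for k in sorted(settings))


settings = json.loads('{"corpus": "demo", "sentence_chunk": "paragraph"}')
if not validate(settings, SCHEMA):
    makefile_text = make_makefile(settings)
```

Because the frontend's form and the backend's validation would both read the same schema, adding a property in one place is enough for both sides to pick it up.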

Cron jobs

The scripts mentioned here, in the catapult directory, contain some absolute paths that need to be configured.

The catapult is kept alive with catapult/, which restarts it with catapult/ if it does not respond to ping.
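The keep-alive logic amounts to "ping, and restart on failure". A minimal sketch, assuming the ping and the restart can each be expressed as a callable; the function name and both callables are hypothetical, and the actual scripts may work differently:

```python
def keep_alive(ping, restart):
    """Call restart() if ping() fails; return True if a restart happened.

    ping    -- callable returning True when the service responds
    restart -- callable that starts the service again
    """
    try:
        alive = bool(ping())
    except OSError:
        # A refused or missing socket counts as "not responding".
        alive = False
    if not alive:
        restart()
        return True
    return False
```

Run from cron every few minutes, a check like this does nothing while the service responds and restarts it at most once per run.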

The optional script catapult/ updates saldo. This takes some time, and is therefore run during the night. The catapult is restarted afterwards by the script.

Builds that have not been accessed for 24 hours are removed every midnight by issuing

The cron jobs are in catapult/cronjobs and look like this:

1 0 * * * curl
*/5 * * * * /home/dan/annotate/webservice/catapult/
1 3 * * * /home/dan/annotate/webservice/catapult/
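The midnight cleanup rule (remove builds not accessed for 24 hours) can be sketched as below. This is an illustration under stated assumptions, not the backend's implementation: it assumes each build is a directory under one builds directory and uses the directory's modification time as a stand-in for "last accessed". The function name and parameters are hypothetical.

```python
import os
import shutil
import time


def remove_stale_builds(builds_dir, max_age_seconds=24 * 60 * 60, now=None):
    """Delete build directories older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(builds_dir)):
        path = os.path.join(builds_dir, name)
        if not os.path.isdir(path):
            continue
        # Assumption: modification time approximates the last access.
        last_touched = os.stat(path).st_mtime
        if now - last_touched > max_age_seconds:
            shutil.rmtree(path)
            removed.append(name)
    return removed
```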