libcrowds-analyst

A headless web application to help with real-time analysis of LibCrowds results.

Receives webhooks from a PyBossa server and analyses task runs according to the rules set out for that project (see Analysis). The task's result is then updated accordingly.

To facilitate reproducible research, since v3.0.0 the DOI of the version of LibCrowds Analyst and the endpoint used for analysis are added to each result.

Requirements

PyBossa >= 1.2.0.
A running Redis server.

Build setup

# Install dev packages
sudo apt-get install libxml2-dev libxslt-dev python-dev lib32z1-dev

# Install LibCrowds Analyst
virtualenv env
source env/bin/activate
python setup.py install

# Run
python run.py

# Test
python setup.py test

Some basic templates for deployment using nginx, uwsgi and supervisor are provided in the contrib folder.

Configuration

Make a local copy of the configuration file to change the default settings:

cp settings.py.tmpl settings.py

The important settings to maintain are:

# URL of the PyBossa server
ENDPOINT = 'http://{your-pybossa-server}'

# DOI of the current LibCrowds Analyst version
DOI = ''

Analysis

The analysis procedure for each project is described below.

While the preferred way to run analysis is to set up webhooks using the endpoints listed below, there are cases where you may want to trigger analysis manually. This can be done by adding project_short_name={short_name} as a URL parameter to any of those endpoints, which retrieves all results for the project and adds them to the analysis queue. Exercise extreme caution here: this will overwrite all results currently stored for that project and there is no undo!
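As a rough illustration of the manual trigger, the following sketch builds such a URL. The base URL, API key and project short name are placeholders, and the helper name manual_analysis_url is hypothetical (it is not part of LibCrowds Analyst). The README does not state which HTTP method to use, so the sketch only constructs the URL and sends nothing.

```python
from urllib.parse import urlencode


def manual_analysis_url(base_url, endpoint, api_key, project_short_name):
    """Build a URL that asks an analysis endpoint to re-analyse a whole project.

    Hypothetical helper for illustration only.
    """
    query = urlencode({
        "api_key": api_key,
        "project_short_name": project_short_name,
    })
    return "{}{}?{}".format(base_url.rstrip("/"), endpoint, query)


# Placeholder values: replace with your own server, key and project.
url = manual_analysis_url(
    "http://localhost:5000", "/convert-a-card", "your-api-key", "my_project"
)
print(url)
```

Remember that requesting this URL overwrites every stored result for the project, so it is worth double-checking the short name first.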

Convert-a-Card

WEBHOOK ENDPOINT: /convert-a-card?api_key={your-api-key}

All task runs are compared, looking for a match rate of at least 70% for the answer keys oclc and shelfmark (disregarding task runs where all answer fields have been left blank).

If a match is found, the result associated with the task is updated to the matched answer for each key and analysis_complete is set to True.

If all keys for all answers have been left blank the result will be set to the empty string for each key and analysis_complete will be set to True.

For all other cases the result will be set to the empty string for each key and analysis_complete will be set to False. These are the results that will have to be checked manually, after which analysis_complete should be set to True.
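The three cases above can be sketched roughly as follows. This is a minimal illustration, not the code used by LibCrowds Analyst: the function name analyse is hypothetical, each answer is assumed to be a plain dict, only oclc and shelfmark are treated as answer fields when checking for blanks, and each key is assumed to need to reach the 70% threshold independently.

```python
from collections import Counter

MATCH_RATE = 0.7  # "at least 70%", as stated above
KEYS = ("oclc", "shelfmark")


def analyse(task_run_answers):
    """Return result info for one task from a list of answer dicts (sketch)."""
    # Disregard task runs where every answer field was left blank.
    answered = [a for a in task_run_answers
                if any(a.get(k, "") for k in KEYS)]
    result = {k: "" for k in KEYS}

    if not answered:
        # All keys blank for all answers: nothing to check manually.
        result["analysis_complete"] = True
        return result

    matched = {}
    for key in KEYS:
        # Most common value for this key and how often it occurs.
        value, count = Counter(a.get(key, "") for a in answered).most_common(1)[0]
        if count / len(answered) >= MATCH_RATE:
            matched[key] = value

    if len(matched) == len(KEYS):
        result.update(matched)
        result["analysis_complete"] = True
    else:
        # No match: keys stay empty and the result needs a manual check.
        result["analysis_complete"] = False
    return result
```

For example, three identical answers plus one blank task run would produce a completed result, while two conflicting answers would leave analysis_complete set to False.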

Example result info

{
  "comments": "",
  "shelfmark": "15673.d.13",
  "oclc": "865706215",
  "doi": "10.5281/zenodo.890858",
  "analysis_complete": true
}

In the Spotlight: Selections

WEBHOOK ENDPOINT: /playbills/select?api_key={your-api-key}

The annotations for all task runs are compared. Those with similar selection rectangles are clustered and analysis_complete is set to True.
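To give a feel for what clustering similar rectangles could look like, here is a small sketch. It is an assumption-heavy illustration, not the method used by LibCrowds Analyst: rectangles are assumed to be (x, y, w, h) tuples, similarity is measured by intersection over union (IoU), the 0.5 threshold is arbitrary, and the names iou and cluster are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two rectangles given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0


def cluster(rects, threshold=0.5):
    """Greedily group rectangles by IoU with each cluster's first member."""
    clusters = []
    for rect in rects:
        for group in clusters:
            if iou(rect, group[0]) >= threshold:
                group.append(rect)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append([rect])
    return clusters


selections = [(0, 0, 100, 100), (5, 5, 100, 100), (300, 300, 50, 50)]
print(cluster(selections))
```

Here the first two selections overlap heavily and fall into one cluster, while the third stands alone.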

Example result info

{
  "annotations": [],
  "doi": "10.5281/zenodo.890858",
  "analysis_complete": true
}
