nuin/biostar-central

 
 

BioStar Codebase

Introduction

The BioStar codebase is a Python and Django based Q&A web application modeled after the StackOverflow Q&A engine.

Our primary goal is to create a simple, generic, flexible and extensible Q&A framework.

Requirements

The software requires only Python (2.6 or higher) to run. All other libraries are included in the distribution. The code will run with no changes on any operating system that supports Python.

This software runs the BioStar Bioinformatics Q&A site at: http://www.biostars.org

Installation

Unpack the source code archive. There are a few dependencies that are also included with Biostar. These only need to be installed if you don't already have them on your system. Switch to the libs directory and unpack the docutils.zip and the django.zip archives:

$ cd libs
$ unzip docutils.zip
$ unzip django.zip
$ # switch back to the source directory
$ cd ..

For faster loading you may also want to unzip the entire libraries.zip file located in the libs folder.

Quickstart

From the command line execute:

$ ./biostar.sh init import run

Visit http://localhost:8080 to view your site. Enjoy!

Note: The Windows version of the biostar.sh manager has not yet been written. The site will work just fine on Windows, but for now users will need to manually invoke the commands present in the biostar.sh run manager.

Detailed Usage

There is a main run manager in the root directory:

$ ./biostar.sh 

Execute it with no parameters for information on usage. This run manager can take one or more commands. For example, to initialize the database, populate it with the test data and then run the server, one would invoke it in the following way:

$ ./biostar.sh init 
$ ./biostar.sh import
$ ./biostar.sh run

Alternatively, one may run all these commands at once:

$ ./biostar.sh init import run

Note: If the database models change you must reset and reinitialize the database. This will remove all existing content! The re-initialization is database specific; for the default sqlite deployment you can use:

$ ./biostar.sh delete init import run

The biostar.sh run manager pulls in environment variables that allow you to customize locations, test fixtures, etc. Edit the biostar.sh script to override the various settings.

Search requires indexing that is disabled during a migration or import. To enable the search you will need to manually trigger the indexing via:

$ ./biostar.sh index

The default server will bind to all IP adapters (0.0.0.0) on port 8080. Visit http://localhost:8080 to interact with your version of the test server.

There are also commands that support Postgresql specific functionality:

pgdrop pgdump pgreset pgimport

Most operations are customized via environment variables. To show their current settings use:

$ ./biostar.sh env

Data Migration

To load content from a StackExchange 1 XML datadump one needs to migrate the data into the new schema. This is accomplished via the migrate command:

$ ./biostar.sh migrate

This command in turn invokes the main/migrate.py script. Run this script (note that the Django settings need to be properly set beforehand) with the -h flag to see the flags it can take:

$ python -m main.migrate -h

Note

The migrate command used via the biostar.sh run manager makes use of an in-memory database as specified in the conf/memory.env and conf/memory.py files.

The result of a data migration is a compressed json data fixture file that, in turn, may be used via the import command:

$ ./biostar.sh init import index

Account migration

There is an automatic account migration based on the email provided by the OpenID provider. Only the information from a subset of well known OpenID providers is trusted enough to allow automatic account merging. Accepted providers are: Google, Yahoo, Myopenid, LiveJournal, Blogspot, AOL, and Wordpress. For other users, manual migration of accounts will be required. Users listed in the Django ADMINS settings will have full administration privileges.

There is a postgresql database management script in conf/pg-manager.sh that is used to facilitate data dumps and restoration.

Environment variables may be used to customize the behavior:

  • `DJANGO_SETTINGS_MODULE`: the configuration module for Django
  • `PYTHON`: the python executable that is to be invoked
  • `FIXTURE`: output path to the (gzipped) file that will contain the data fixture
  • `MIGRATE_PATH`: path to the directory that stores the StackExchange XML dump
  • `MIGRATE_LIMIT`: the number of records to load from the XML dump
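
As a rough illustration of how a script could consume these variables, the sketch below collects them from the environment. The function name `migration_settings`, the returned keys and the fallback values are hypothetical and are not taken from the Biostar source; this is a sketch of the idea, not a description of what main/migrate.py actually does.

```python
import os


def migration_settings(environ=None):
    """Collect the migration-related variables from a mapping (os.environ by default)."""
    if environ is None:
        environ = os.environ
    limit = environ.get("MIGRATE_LIMIT")
    return {
        "settings_module": environ.get("DJANGO_SETTINGS_MODULE"),
        # Assumption: fall back to plain "python" when PYTHON is unset.
        "python": environ.get("PYTHON", "python"),
        "fixture": environ.get("FIXTURE"),
        "migrate_path": environ.get("MIGRATE_PATH"),
        # Assumption: an unset or empty limit means "no limit".
        "migrate_limit": int(limit) if limit else None,
    }
```

Passing a plain dictionary instead of `os.environ` makes it easy to check which values a given configuration would produce before starting a long migration.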

A current Biostar run with about 4K users, 30K posts, 40K edits and 60K votes generates about 300K database entries of various kinds. Data migration into a fixture takes about 1 hour and 10 GB of RAM. This is an area where we could do a lot better (possibly orders of magnitude better).

The resulting data fixture is database independent and can now be loaded into any type of database supported by Django: sqlite, mysql, postgresql. For example, loading it into postgresql takes about 2 hours and 2 GB of RAM.

Note that the databases can be dumped and restored with far fewer resources. Dumping from or restoring directly into postgresql, for example, takes no more than a few minutes.

Testing

Testing also measures code coverage and therefore requires the coverage module. For your convenience this module is included in the libs/libraries.zip archive. Install coverage or unzip the archive.

Testing can be initiated via the biostar.sh run manager:

$ ./biostar.sh test

A reports directory will be created in the root directory that contains html reports on the code coverage by the tests. View the report/index.html file.

Selenium tests can be run via:

$ ./biostar.sh selenium

Please note that for this to work properly the python selenium library bindings must be installed. Moreover, the SELENIUM_TEST_LOGIN_TOKEN variable must be set in your Django settings file:

SELENIUM_TEST_LOGIN_TOKEN = "somepasswordgoeshere"

In addition, both the test site and the command line above must make use of the same settings file.

How the site works

Posts may be formatted in Markdown (default) or ReST markup standards. The second format, ReST, will be triggered by starting the post with the .. rest:: directive.
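
The format selection described above can be sketched in a few lines. The helper names `detect_markup` and `post_body` are hypothetical, and two details are assumptions rather than documented behavior: that leading whitespace before the directive is ignored, and that the directive is removed before the post is rendered.

```python
REST_DIRECTIVE = ".. rest::"


def detect_markup(text):
    """Return "rest" when the post starts with the directive, else "markdown"."""
    # Assumption: leading whitespace before the directive is ignored.
    if text.lstrip().startswith(REST_DIRECTIVE):
        return "rest"
    return "markdown"


def post_body(text):
    """Return the post content with the directive removed, if present."""
    # Assumption: the directive itself is not part of the rendered post.
    if detect_markup(text) == "rest":
        return text.lstrip()[len(REST_DIRECTIVE):].lstrip()
    return text
```

For example, `detect_markup("Hello *world*")` selects Markdown, while a post beginning with `.. rest::` on its first line selects ReST.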

User reputation is the sum of all upvotes and accepted answers that a user accumulates. Note that multiple answers may be accepted on a question; in effect this gives the author of a question a way to reward excellent answers twice.
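
A minimal sketch of this sum follows. The function name `reputation` and the point values are assumptions: the text above only says that reputation is a sum, so one point per upvote and one per accepted answer is used here purely for illustration.

```python
def reputation(upvotes, accepted_answers, upvote_points=1, accept_points=1):
    """Sum the points a user earned from upvotes and accepted answers."""
    # Assumption: each upvote and each accepted answer is worth one point
    # unless other point values are passed in.
    return upvotes * upvote_points + accepted_answers * accept_points
```

Under these assumptions, an answer that is both upvoted and accepted contributes twice to its author's total, once through each term.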

In Biostar there are four types of users: anonymous users, registered users, moderators and administrators.

anonymous users

May browse all content of a site.

registered users

In addition to the privileges that anonymous users have, registered users may create new posts if their reputation exceeds a limit (the default is zero), and may vote and post answers and comments.

moderators

In addition to the privileges that registered users have, moderators may edit, close and delete posts, edit user information (other than email) and may also suspend and reinstate users. All the actions of the moderators may be followed via the Moderator Log page (see the About BioStar page for a link).

administrators

In addition to the privileges that moderators have, administrators may promote users to or demote users from moderator roles. Administrators also have access to the django admin interface, where they may perform more database actions than those offered via the BioStar interface.
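
Since each user type inherits the privileges of the one before it, the scheme can be sketched as an ordered list of roles. The action names below are hypothetical labels that summarize the descriptions above; they are not identifiers from the Biostar code, and the reputation limit for creating posts is not modeled.

```python
# Roles ordered from least to most privileged; each inherits from those before it.
ROLES = ["anonymous", "registered", "moderator", "administrator"]

# Hypothetical action names summarizing the privileges each role adds.
ADDED_PRIVILEGES = {
    "anonymous": set(["browse"]),
    "registered": set(["create_post", "vote", "answer", "comment"]),
    "moderator": set(["edit_post", "close_post", "delete_post",
                      "edit_user", "suspend_user", "reinstate_user"]),
    "administrator": set(["promote_user", "demote_user", "django_admin"]),
}


def privileges(role):
    """Return every action available to a role, including inherited ones."""
    granted = set()
    for name in ROLES[:ROLES.index(role) + 1]:
        granted |= ADDED_PRIVILEGES[name]
    return granted
```

With this layout, `"vote" in privileges("moderator")` holds because moderators inherit from registered users, while `"promote_user"` is only granted to administrators.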

Content Persistence

Content may be deleted (marked invisible to users) or destroyed (removed from the database).

A post submitted for deletion will be destroyed only if the author requests the deletion and the post does not have any followups (answers/comments) associated with it. Deleted top level posts are marked invisible to regular users.
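
The rule above reduces to a single condition, sketched here. The function name `removal_action` and its arguments are hypothetical and chosen only to illustrate the decision.

```python
def removal_action(requested_by_author, followup_count):
    """Decide whether a post submitted for deletion is destroyed or only hidden."""
    # Destroy only when the author asks and nothing (answers/comments) depends on the post.
    if requested_by_author and followup_count == 0:
        return "destroy"
    # Otherwise the post stays in the database and is marked invisible.
    return "delete"
```

So a moderator removing a post, or an author removing a post that already has answers or comments, results in a deletion rather than a destruction.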

Code Layout

The Python code, templates, static content (css, images, javascript) and default database are found in the main directory. There is a partial datadump of the existing BioStar content in the import folder. The import command will load this data into the current database.

Other Libraries

Biostar is built with open source libraries. The following software packages are used and if necessary included and distributed with BioStar: