
BFSSI-Bioinformatics-Lab/miseq_portal


Portal Documentation

Overview

The MiSeq Portal was built with Django, PostgreSQL, RabbitMQ, Celery, and Django REST Framework.

Administrator Usage

Admins and staff have the ability to upload entire runs to the MiSeq portal database. This is done via the miseq_uploader app (http://192.168.1.202:8000/miseq_uploader/).

To do this, the user must be logged into the host machine (genomcis-portal-dev on Proxmox). From there, locally stored runs can be uploaded via the 'Upload MiSeq Run' button on the webpage.

The run to be uploaded must have the same folder structure as a local MiSeq run. If the run is only available on BaseSpace, it must first be retrieved with BaseMountRetrieve, which structures the run in the format that the portal expects when parsing and uploading run data (i.e. reads, InterOp, stats, logs). The user must supply the full path to the run.
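Before supplying a path, it can help to confirm that the folder resembles a local MiSeq run. The sketch below is not part of the portal. The function name missing_run_entries and the list of expected entries are assumptions based on a typical MiSeq output folder, so adjust them to whatever the portal's parser actually requires.

```python
from pathlib import Path

# Assumed layout of a local MiSeq run folder; NOT taken from the portal's code.
EXPECTED_ENTRIES = (
    "SampleSheet.csv",
    "RunInfo.xml",
    "InterOp",
    "Data/Intensities/BaseCalls",
)


def missing_run_entries(run_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from run_dir, in order."""
    root = Path(run_dir)
    return [entry for entry in expected if not (root / entry).exists()]
```

Calling missing_run_entries with the full path to a run returns an empty list when every assumed entry is present, and otherwise lists what is missing.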

Bioinformatics dependencies

The portal runs several bioinformatics analyses and therefore requires the following packages to be installed:

  • skesa
  • Prokka
  • mash
  • ConFindr
  • BBMap suite (e.g. sendsketch.sh, bbmap.sh, bbduk.sh)
  • RGI
  • Qualimap
  • Quast

Several of these can be installed via conda. Each should be given its own isolated conda environment; otherwise you will likely run into dependency conflicts (especially with Prokka).
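A quick way to see which tools are reachable from the current shell is to look each one up on the PATH. This is a minimal sketch, not part of the portal. The executable names below are assumptions (they can differ between versions and installation methods, e.g. quast versus quast.py), and because each tool may live in its own conda environment, the check only reflects whichever environment is active when it runs.

```python
import shutil

# Assumed executable names for the packages listed above; verify locally.
ASSUMED_EXECUTABLES = (
    "skesa", "prokka", "mash", "confindr.py",
    "sendsketch.sh", "bbmap.sh", "bbduk.sh",
    "rgi", "qualimap", "quast.py",
)


def missing_executables(names=ASSUMED_EXECUTABLES):
    """Return the names that shutil.which cannot find on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


print("Not found on PATH:", missing_executables() or "none")
```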

Celery + RabbitMQ

Computationally heavy tasks are offloaded to the server via Celery and RabbitMQ. Tasks are created with the @shared_task decorator and are detected by Celery. The task queue is managed by the broker, RabbitMQ.

Celery is distributed with this project, though RabbitMQ must be installed and set up separately.

Installing and configuring RabbitMQ

sudo apt install rabbitmq-server

Setting up RabbitMQ with a miseq_portal user can be done with the following commands.

sudo rabbitmqctl add_user miseq_portal <password_goes_here>
sudo rabbitmqctl add_vhost miseq_portal_vhost
sudo rabbitmqctl set_user_tags miseq_portal administrator
sudo rabbitmqctl set_permissions -p miseq_portal_vhost miseq_portal ".*" ".*" ".*"
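Celery reaches RabbitMQ through a broker URL of the form amqp://user:password@host:port/vhost. The sketch below shows how the user and vhost created above could be combined into such a URL. The helper name build_broker_url, the localhost host, and the default port 5672 are assumptions, and how the portal's settings actually consume the URL is not shown here.

```python
from urllib.parse import quote


def build_broker_url(user, password, vhost, host="localhost", port=5672):
    """Assemble an AMQP URL, percent-encoding user, password, and vhost."""
    return "amqp://{}:{}@{}:{}/{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        port,
        quote(vhost, safe=""),
    )


# "example-password" is a placeholder, not a real credential.
print(build_broker_url("miseq_portal", "example-password", "miseq_portal_vhost"))
```

Percent-encoding matters when the password contains characters such as @ or /, which would otherwise be misread as URL delimiters.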

RabbitMQ can be monitored via the RabbitMQ Management plugin. The web interface should be accessible via 0.0.0.0:15672 with the login credentials specified above.

sudo rabbitmq-plugins enable rabbitmq_management

Configuring Celery

Note: The following tasks are handled automatically - see the Supervisor section below.

A Celery worker must be launched in order to watch for incoming tasks. The concurrency parameter must be set to 1; any other value will cause bugs.

celery -A miseq_portal.taskapp worker -l INFO -E --concurrency 1

Celery can be monitored via flower. This package is distributed alongside this project. The following command will launch a web interface that will be accessible via 0.0.0.0:5555.

celery -A miseq_portal.taskapp flower

Data Backup/Storage

All user-supplied runs are stored on the BMH-WGS-Backup NAS (https://192.168.1.176:5001/), which is mounted on the host machine (/mnt/MiSeqPortal).

This is the MEDIA_ROOT as specified in config.settings.base:

MEDIA_ROOT = "/mnt/MiSeqPortal"
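Because MEDIA_ROOT lives on a network mount, an upload attempted while the NAS is not mounted would write to the host's local disk instead. The following is a minimal sketch of a pre-flight check, assuming /mnt/MiSeqPortal is itself the mount point; the function name storage_is_mounted is hypothetical and not part of the portal.

```python
import os


def storage_is_mounted(path="/mnt/MiSeqPortal"):
    """Return True only if path exists and is a mount point."""
    return os.path.ismount(path)


if not storage_is_mounted():
    print("Warning: /mnt/MiSeqPortal is not mounted; uploads would not reach the NAS.")
```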

Redundant backups

The uploaded runs are also backed up to the Wolf_Station NAS (https://192.168.1.205:5001) into the BMH-WGS-Backup-MiSeqPortal shared folder. This is done via the DSM Hyper Backup application and occurs once every week (Sunday evening).

Supervisor

Supervisor can be installed with the following command: sudo apt install supervisor

Supervisor is used to keep the following four processes alive:

  1. manage.py runserver
  2. celery (assembly queue)
  3. celery (analysis queue)
  4. flower

See the contents of the following config file for details: /etc/supervisor/conf.d/miseq_portal.conf. After making changes to this config file, be sure to run the following two commands: sudo supervisorctl reread, then sudo supervisorctl update

A local web interface for Supervisor is available at 0.0.0.0:9001. The configuration for this interface can be found at /etc/supervisor/supervisord.conf, specifically under the [inet_http_server] heading. The status of each process, as well as live-updating logs, can be viewed here.

Additional logs:

/var/log/celery_assembly.err.log

/var/log/celery_assembly.out.log

/var/log/celery_analysis.err.log

/var/log/celery_analysis.out.log

/var/log/flower.err.log

/var/log/flower.out.log
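When the web interface is unavailable, the most recent lines of these logs can be read directly. The helpers below are an illustrative sketch (the names tail and show_recent are hypothetical); reading files under /var/log may require elevated permissions.

```python
from collections import deque
from pathlib import Path

LOG_FILES = (
    "/var/log/celery_assembly.err.log",
    "/var/log/celery_assembly.out.log",
    "/var/log/celery_analysis.err.log",
    "/var/log/celery_analysis.out.log",
    "/var/log/flower.err.log",
    "/var/log/flower.out.log",
)


def tail(path, n=20):
    """Return the last n lines of a text file, or [] if it does not exist."""
    p = Path(path)
    if not p.is_file():
        return []
    with p.open(errors="replace") as handle:
        # deque with maxlen keeps only the final n lines while streaming.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def show_recent(log_files=LOG_FILES, n=5):
    """Print the last n lines of each log file that exists."""
    for log in log_files:
        print("==", log)
        for line in tail(log, n):
            print(line)
```

Calling show_recent() prints a short excerpt from each of the six logs listed above.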

About

Internal web portal for analyzing and retrieving NGS data
