billyziege/pipeline_project
Quick Intro

Each class of keyed object gets its own csv file via object-relational mapping, and I try not to touch these files except through tools. This set of files is what I call the mockdb. All the information for all objects of a class is stored in that class's file. For each class, I also have an object that is essentially a dictionary of the form {key: object}, which lets me keep the information in memory, keyed to the primary object key.

Some objects, like Case and Control, are "children" of a higher-category object (Sample). I keep track of children through class inheritance, and I can also work out the ancestors and the progenitor of a class. All children can be loaded for a given object, if requested. This is how we could write RapidRunSequencing and HighThroughputSequencing objects and either load them individually or pull them together in the parent object SequencingRun, which could also have a separate csv file for data where the run type is unknown.

As in relational databases, the key for another object is often stored within the data for a given object. For instance, a barcode object has information specifying both the sample to which it was attached and the lane in which it was run. Because related objects can be looked up directly by these stored keys, data extraction is very intuitive.

Processes, such as zcat or bcbio-gen, are also keyed objects and are stored in a similar fashion. Thus, to run the pipeline, I just have to load the data into the correct objects, link the objects, and tell the process object to run.
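To make the layout above concrete, here is a minimal sketch of the idea, not the code in this repository. The names KeyedObject, MockDB, descendants, and the field names project, sample_key, and lane_key are all hypothetical and chosen only for illustration. It shows one csv file per class, a {key: object} dictionary in memory, children found through class inheritance, and a barcode that stores the key of its sample.

```python
import csv
import os
import tempfile


class KeyedObject:
    """Hypothetical base class: one instance per csv row, identified by `key`."""
    fields = ["key"]

    def __init__(self, key, **attrs):
        self.key = key
        for name in self.fields:
            if name != "key":
                setattr(self, name, attrs.get(name, ""))

    def to_row(self):
        return {name: getattr(self, name) for name in self.fields}


class Sample(KeyedObject):
    fields = ["key", "project"]


class Case(Sample):      # child of Sample
    pass


class Control(Sample):   # child of Sample
    pass


class Barcode(KeyedObject):
    # Stores the keys of other objects, like a foreign key.
    fields = ["key", "sample_key", "lane_key"]


def descendants(cls):
    """Return every subclass of cls, found through class inheritance."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(descendants(sub))
    return found


class MockDB:
    """Hypothetical store: one csv file per class inside `directory`."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, cls):
        return os.path.join(self.directory, cls.__name__ + ".csv")

    def save(self, cls, objects):
        """Write a {key: object} dictionary to the csv file for cls."""
        with open(self._path(cls), "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=cls.fields)
            writer.writeheader()
            for obj in objects.values():
                writer.writerow(obj.to_row())

    def load(self, cls, with_children=False):
        """Return {key: object} for cls, optionally including its children."""
        objects = {}
        classes = [cls] + (descendants(cls) if with_children else [])
        for klass in classes:
            path = self._path(klass)
            if not os.path.exists(path):
                continue
            with open(path, newline="") as handle:
                for row in csv.DictReader(handle):
                    key = row.pop("key")
                    objects[key] = klass(key, **row)
        return objects


with tempfile.TemporaryDirectory() as tmp:
    db = MockDB(tmp)
    db.save(Case, {"c1": Case("c1", project="p1")})
    db.save(Control, {"n1": Control("n1", project="p1")})
    db.save(Barcode, {"b1": Barcode("b1", sample_key="c1", lane_key="lane3")})

    samples = db.load(Sample, with_children=True)   # pulls in Case and Control
    barcodes = db.load(Barcode)
    linked_sample = samples[barcodes["b1"].sample_key]  # follow the stored key
```

Loading Sample with with_children=True gathers the Case and Control rows into one dictionary, which mirrors pulling RapidRunSequencing and HighThroughputSequencing together under SequencingRun. Following barcodes["b1"].sample_key is the kind of key-based lookup that makes data extraction straightforward.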
About
Scripts that wrap and store information for an NGS pipeline.