This repository has been archived by the owner on Dec 4, 2018. It is now read-only.

nhejazi/stat159-repro-datasci

 
 


UC Berkeley's Statistics 159/259

Project Group Gamma, Fall Term 2015

Group members: Nima Hejazi, Feng Lin, Luyun Zhao, & Xinyue Zhou

Topic: Working Memory in Healthy and Schizophrenic Individuals


Directions/Roadmap

Recommended Steps

  1. make requirements - installs (via pip) all packages listed in requirements.txt, plus ggplot==0.6.8
  2. make test - runs all of the tests for the scripts used in this analytic project
  3. make coverage - runs the coverage tests and generates the Travis coverage report
  4. make data - downloads all data for analysis (around 2 hours on a 25 MBps connection, owing to the size of the data)
  5. make validate - validates the checksums of the data to ensure integrity
  6. make analysis - generates the results and figures referenced in the report
  7. make paper - compiles the (final) full paper describing our findings
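
The checksum step ('make validate') can be pictured with a minimal Python sketch. This is not the project's own validation script: the use of MD5, the function names file_md5 and validate_checksums, and the shape of the `expected` mapping are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large fMRI files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_checksums(expected, data_dir="data"):
    """Return the names of files that are missing or whose digest differs.

    `expected` is a hypothetical mapping from a file name (relative to
    `data_dir`) to the hex digest that file should have.
    """
    failures = []
    for name, want in expected.items():
        path = Path(data_dir) / name
        if not path.is_file() or file_md5(path) != want:
            failures.append(name)
    return failures
```

An empty list from validate_checksums would mean that every listed file is present and matches its recorded digest.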

Alternative Steps

Please read the section Data (below) if you plan to use these (alternative) steps

  1. make requirements - installs (via pip) all packages listed in requirements.txt, plus ggplot==0.6.8
  2. make test - runs all of the tests for the scripts used in this analytic project
  3. make coverage - runs the coverage tests and generates the Travis coverage report
  4. make conditionfiles - downloads the condition files necessary for analysis (takes up to 8 hours)
  5. make data - downloads the rest of the data for analysis (around 2 hours on a 25 MBps connection, owing to the size of the data)
  6. make validate - validates the checksums of the data to ensure integrity
  7. make analysis - generates the results and figures referenced in the report
  8. make paper - compiles the (final) full report describing our findings

Backups and Issues

If errors occur while reproducing the paper, a copy of the final version that we generated (report_backup.pdf) is stored in the paper/ directory. You are welcome to report issues and suggestions on our GitHub repository so that we may further improve this work!

Requirements

Requirements are specified in .travis.yml and requirements.txt. In addition, ggplot==0.6.8 is required but is not currently supported by Travis, so it is installed by a separate pip install command in 'make requirements'.

Codebase

Utility (utils) methods may be found in the subdirectory code/utils. Analysis scripts are in the code/ directory.

Tests and Coverage

'make test' runs all the tests for this project repository. Methods were extracted into individual modules inside the utils/ subdirectory under code/. Plotting helper functions remain inside the analysis scripts and are not tested.

'make coverage' runs the coverage tests and generates the coverage report.

Data

'make data' downloads all the data except for the condition files. Running this takes around 2 hours on average.

The data/ directory is initially empty except for 'net_roi.txt', which comes from the supplemental material of the reference paper and may be found at this link. We manually extracted the information from Table S1 and made it available here as 'net_roi.txt'.

'make conditionfiles' downloads all of the necessary condition files for the analysis.

Downloading the condition files is a separate command because these files are bundled with the raw fMRI data on the OpenFMRI site, so obtaining them takes several hours. Since we need only the condition files and not the rest of the data in the bundle, we created the separate command to hasten the download process. To improve accessibility, we also included the condition files (in TXT format) in the data/ directory of this repository. For reproducibility, you can run this command to obtain the same condition files used in the reported analysis.
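
For readers who want to inspect a condition file once it is in data/, the following is a minimal sketch of a reader. It assumes the files use a whitespace-separated three-column layout (onset, duration, amplitude), which this README does not confirm; the function name read_condition_file is hypothetical and is not taken from code/utils.

```python
def read_condition_file(path):
    """Parse a condition file into a list of (onset, duration, amplitude).

    Assumes three whitespace-separated numeric columns per line; this
    layout is an assumption, not something stated in the README.
    """
    events = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            # A line with fewer than three columns raises ValueError here.
            onset, duration, amplitude = (float(value) for value in fields[:3])
            events.append((onset, duration, amplitude))
    return events
```

Each returned tuple describes one task event, which is the form most design-matrix code expects as input.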

Analysis

'make analysis' runs all the analysis scripts and populates the results/ directory with the diagrams and graphs referenced in the full report.

Paper

'make paper' produces the full report describing this research project.

Contributors

Nima Hejazi: nhejazi

Feng Lin: LiamFengLin

Luyun Zhao: lynnzhao

Xinyue Zhou: z357412526

Very special thanks to Jarrod Millman, Matthew Brett, Ross Barnowski, and J-B Poline for your teaching and your invaluable advice over the course of the semester; this project would not have been possible without your help!

About

🎒 course project for Reproducible and Collaborative Statistical Data Science, UC Berkeley
