Keywords: Diffusion-Weighted MRI, Fiber Tracking, Machine Learning, Brain
This repository provides a utility to perform rapid prototyping of data-driven tractography models.
First, download the subject files from the Human Connectome Project into the hcp_zips
folder. Each subject archive has a name similar to 917255_3T_Diffusion_preproc.zip,
where 917255 is the subject ID. In addition, save the corresponding
gold-standard track files as a zip file called hcp_trks.zip.
The hcp_zips folder will then be similar to this:
project/
├──hcp_zips/
|  ├──644044_3T_Diffusion_preproc.zip
|  ├──992774_3T_Diffusion_preproc.zip
|  └──hcp_trks.zip
where hcp_trks.zip is a compressed archive of a folder structured like:
HCP105_Zenodo_NewTrkFormat/
├──644044/
|  └──tracts/
|     ├──CA.trk
|     ├──CC.trk
|     └──...
└──992774/
   └──tracts/
      ├──CA.trk
      ├──CC.trk
      └──...
Then you can simply run the following script. It will unpack the DWI data and tracts, merge them, subsample and resample them, and finally compute the FOD:
./prepare_hcp.sh
For the ISMRM data, simply run ./prepare_ismrm.sh gt.
Using the notebooks in this repository assumes a certain file structure:
project/
├──models/
├──subjects/
| └──992774/
| ├──tracts/
| ├──resampled_fibers/
| ├──predicted_fibers/
| ├──samples/
| ├──seeds/
| └──fod.nii.gz
├──entrack/
| ├──inference_t.ipynb
| └──training_conditional_t.ipynb
├──fiber_resampling.ipynb
├──generate_conditional_samples_t.ipynb
└──trk2seeds.ipynb
The file structure is described briefly below; for more details, please refer to the individual files.
models/: Contains Keras models trained in training_conditional_t.ipynb, along with YAML configs that contain the training parameters.
subjects/: This is the data folder, comprising DWI data, training fibers, and their processed forms. Specifically, we use T1 and DWI data from the HCP project, and reference fibers from TractSeg.
- tracts/ Raw fiber bundles (.trk), downloaded from TractSeg.
- resampled_fibers/ Fibers with data (.trk), interpolated by fiber_resampling.ipynb.
- predicted_fibers/ Fibers (.trk), predicted by inference_t.ipynb.
- samples/ Collections of (vin, D, vout) samples (.npz), produced by generate_conditional_samples_t.ipynb.
- seeds/ Fiber seed files (.npy), generated by generate_conditional_samples_t.ipynb.
- fod.nii.gz Additional DWI data, from the HCP project.
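The sample files described above are plain .npz archives; a minimal sketch of reading and writing them (the key names vin, D, and vout, and the array shapes, are assumptions based on the description in this README, not the notebooks' actual format):

```python
import numpy as np

# Hypothetical illustration of the samples/ format; the actual array names
# and shapes used by the notebooks may differ.
vin = np.random.randn(100, 3)    # incoming fiber directions
D = np.random.randn(100, 32)     # local diffusion data (feature size assumed)
vout = np.random.randn(100, 3)   # outgoing fiber directions

np.savez("example_samples.npz", vin=vin, D=D, vout=vout)
samples = np.load("example_samples.npz")
```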
entrack/: Folder for a specific model class, in this case Entrack. It defines how its models are trained (training_conditional_t.ipynb) and how they are used to predict fibers (inference_t.ipynb).
fiber_resampling.ipynb: Notebook to resample fibers and calculate local fiber geometry data, such as tangent, curvature, and torsion.
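As an illustration of the kind of geometry computed there, the unit tangent along a fiber can be approximated by finite differences (a hedged sketch, not the notebook's actual code, which may use a different estimator):

```python
import numpy as np

def unit_tangents(points):
    """Finite-difference unit tangents for a fiber given as an (N, 3) array.

    Only a sketch of the idea; fiber_resampling.ipynb may use e.g. a
    spline-based estimator instead.
    """
    d = np.gradient(points, axis=0)                      # central differences
    return d / np.linalg.norm(d, axis=1, keepdims=True)  # normalize each row
```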
generate_conditional_samples_t.ipynb: Notebook that defines the generation of training samples.
trk2seeds.ipynb: Small utility notebook to convert the fiber endpoints in a .trk file to seed coordinates (.npy).
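The core step of that conversion might look like this (a sketch assuming fibers are given as (N, 3) point arrays; the notebook's actual code may differ):

```python
import numpy as np

def endpoints_to_seeds(streamlines):
    # Take the first and last point of every fiber as seed coordinates;
    # the result can then be stored with np.save as a .npy file.
    return np.array([p for s in streamlines for p in (s[0], s[-1])])
```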
Throughout this project, we exclusively use the load/save functions from nibabel.streamlines. For .trk files, they follow the TrackVis convention that the origin of the stored fiber coordinates is the corner of the first voxel.
Only upon loading are they transformed to the NIfTI convention, such that the origin of the loaded fibers is the center of the first voxel. More precisely, half a voxel is subtracted from the stored coordinates.
When saved, the fiber coordinates are changed back to the TrackVis convention, i.e. half a voxel is added to the loaded fiber coordinates.
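The two conversions amount to a half-voxel shift in voxel coordinates; sketched below (the function names are ours, for illustration, not a nibabel API):

```python
import numpy as np

def trackvis_to_nifti(coords_vox):
    # Loading: TrackVis stores coordinates relative to the corner of the
    # first voxel; subtract half a voxel to move the origin to its center.
    return np.asarray(coords_vox) - 0.5

def nifti_to_trackvis(coords_vox):
    # Saving: add the half voxel back to return to the TrackVis convention.
    return np.asarray(coords_vox) + 0.5
```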
However, the Tractometer scoring tool uses the deprecated nibabel.trackvis.read, which does not shift the fiber coordinates. Instead, the Tractometer script adds half a voxel to the stored fiber coordinates, so it implicitly assumes that the stored fiber coordinates follow the NIfTI convention!
Unfortunately, this does not comply with fibers saved by nibabel.streamlines.save, so we modified the Tractometer behavior in scoring/challenge_scoring/io/streamlines.py (line 99), setting the shift to 0.0 instead of +0.5.
- Data Preprocessing
- Model Training (train.py):
Posterior agreement can only be computed for the Entrack model (Entrack.yml), as it is the only model with a temperature.
The training parameter callbacks:AutomaticTemperatureSchedule:n_checkpoints controls the granularity of the model checkpoints over the defined temperature range (logarithmic spacing of checkpoints).
Later, the number of model checkpoints also defines the granularity of the PA evaluations.
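Logarithmic spacing of checkpoints over a temperature range can be sketched as follows (the bounds t_max and t_min are hypothetical; the README only states that the spacing is logarithmic):

```python
import numpy as np

def checkpoint_temperatures(t_max, t_min, n_checkpoints):
    # n_checkpoints temperatures, logarithmically spaced from t_max down to
    # t_min, matching the checkpoint granularity described above.
    return np.geomspace(t_max, t_min, n_checkpoints)
```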
- Predictions (optimal_temperature.py):
This script takes the N model checkpoints from the previous step and uses each of them to make two predictions, 2N in total. One prediction pair consists of a .trk file for a subject's scan (e.g. subjects/917255/fod_norm.nii.gz) and one for its repeated scan (subjects/917255retest/fod_norm.nii.gz).
The resulting 2N .trk files are used to compute the PA value for each of the N temperatures.
- Posterior Agreement (agreement.py):
Using the config file generated by optimal_temperature.py, the script agreement.py computes the PA values for all the specified pairs.
Using the 2N .trk files from step 2, we can generate a visualization of the predicted fibers as a function of the precision (similar to fig. 5.11 in my thesis).