LucoLab/Villemin_2020

A cell-to-patient machine learning transfer approach uncovers novel basal-like breast cancer prognostic markers amongst alternative splice variants


All data needed to reproduce the figures can be accessed here: DOI

How to use

Two Python 3 scripts are provided, one for splicing and one for expression.

The input files are in the data directory.

The scripts were written against scikit-learn 0.21.2.

NB: Imputer warnings printed when a script starts are not errors.

Both scripts call an R script to plot survival over the rounds of classification.
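
Because the scripts target one specific scikit-learn release, it can help to check the installed version before running them. The snippet below is a minimal sketch, not part of the repository: `version_tuple` and `check_sklearn` are hypothetical helper names, and comparing only the major and minor numbers is an assumption about what counts as compatible. Note that, as far as the scikit-learn release history goes, the `Imputer` class was removed in 0.22, so a newer installation will likely fail instead of only warning.

```python
from importlib import metadata

# Version stated in this README; everything else here is illustrative.
EXPECTED_VERSION = "0.21.2"


def version_tuple(version):
    """Turn '0.21.2' into (0, 21, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_sklearn(expected=EXPECTED_VERSION):
    """Return a short message describing the installed scikit-learn version."""
    try:
        installed = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return "scikit-learn is not installed"
    # Assumption: matching major.minor is treated as close enough.
    if version_tuple(installed)[:2] != version_tuple(expected)[:2]:
        return f"scikit-learn {installed} found, README expects {expected}"
    return f"scikit-learn {installed} matches the expected release series"


if __name__ == "__main__":
    print(check_sklearn())
```
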

python  classification_cell2patient_splicing.py \
	 -c {absolutepath}/MatriceExonPSI_CellLines.csv \
	 -p {absolutepath}/MatriceExonPSI_Patients.csv \
	 -t 0.6 \
	 -n 1000

python  classification_cell2patient_expression.py \
	 -c {absolutepath}/MatriceGeneTPM_CellLines.csv \
	 -p {absolutepath}/MatriceGeneTPM_Patients.csv \
	 -t 0.6 \
	 -n 1000 
  • -t : Threshold for class probabilities.
  • -c : Path to a matrix of expression/splicing values for cell lines.
  • -p : Path to a matrix of expression/splicing values for patients.
  • -n : Number of trees in the forest.
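
The four options above could be declared with `argparse` roughly as follows. This is a sketch for orientation, not the parser used in the repository scripts: the destination names (`cell_lines`, `patients`, `threshold`, `n_trees`), the validators `probability` and `positive_int`, and the choice to make every option required are all assumptions. The only things taken from this README are the flag letters and their meanings.

```python
import argparse


def probability(text):
    """Parse a float and require it to lie in [0, 1]."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{text!r} is not in [0, 1]")
    return value


def positive_int(text):
    """Parse an integer and require it to be at least 1."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError(f"{text!r} is not a positive integer")
    return value


def build_parser():
    """Build a parser that accepts the -c, -p, -t and -n options."""
    parser = argparse.ArgumentParser(
        description="Illustrative parser for the cell2patient options."
    )
    parser.add_argument("-c", dest="cell_lines", required=True,
                        help="matrix of expression/splicing values for cell lines")
    parser.add_argument("-p", dest="patients", required=True,
                        help="matrix of expression/splicing values for patients")
    parser.add_argument("-t", dest="threshold", type=probability, required=True,
                        help="threshold for class probabilities, between 0 and 1")
    parser.add_argument("-n", dest="n_trees", type=positive_int, required=True,
                        help="number of trees in the forest")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Validating `-t` as a probability and `-n` as a positive integer at parse time means a mistyped value is rejected before any data is loaded.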

The final annotated file is splicing_TCGA_BASAL_HEADER_ADDED.tsv.
You can visualize it with https://software.broadinstitute.org/morpheus.

The best features of interest are written to outputBorutaPy.txt and outputBorutaPy.bed.

