kazeevn/lhcb_trigger_ml

lhcb_trigger_ml

Friendly machine learning for the LHCb experiment. The project lets you train and compare classifiers on a training dataset.

The programming language is Python, and the analysis is performed in IPython notebooks. IPython is an interactive shell for Python that is commonly used in machine learning; it is well suited to development, analysis, and presenting results (plots, histograms, and so on).

Brief demos:

Main points

  • working on uniform classifiers: classifiers whose predictions have low correlation with mass (or with some other variable or variables)
    • MSE, the measure of uniformity
    • an optimized implementation of uBoost
    • uniformGradientBoosting (with different losses; FlatnessLoss is especially interesting)
  • parameter optimization
    See the grid_search module: it provides a simulated-annealing-like optimization of parameters on a dataset, and this optimization can be run on a cluster.
  • plots, plots, plots
    See the reports module: it is a good way to visualize learning curves, ROC curves, and the flatness of predictions with respect to variables.
  • a procedure to generate toy Monte Carlo in the toymc module
    (it generates a new set of events with the same distribution as the set of events we already have), plus a dedicated notebook, 'ToyMonteCarlo', that demonstrates and analyzes its results.
  • parallelism
    ClassifiersDict from reports can train classifiers on an IPython cluster.
    uBoost is quite slow, so it has a built-in parallelism option: the different BDTs inside uBoost can be trained in parallel on a cluster.
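
To make the idea of a uniform classifier concrete, here is a minimal sketch of one way to check whether predictions depend on mass. It is not the MSE measure implemented in this repository: as a simple stand-in, it computes the selection efficiency in equal-population mass bins and takes the standard deviation of those efficiencies. The function names `bin_efficiencies` and `efficiency_spread`, the threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def bin_efficiencies(predictions, mass, threshold, n_bins=10):
    """Fraction of events passing `threshold` in each equal-population mass bin."""
    predictions = np.asarray(predictions, dtype=float)
    mass = np.asarray(mass, dtype=float)
    # Quantile edges give bins that hold roughly the same number of events.
    edges = np.quantile(mass, np.linspace(0.0, 1.0, n_bins + 1))
    # Using only the interior edges maps every event to a bin index 0..n_bins-1.
    bin_index = np.digitize(mass, edges[1:-1])
    passed = predictions > threshold
    return np.array([passed[bin_index == b].mean() for b in range(n_bins)])


def efficiency_spread(predictions, mass, threshold, n_bins=10):
    """Standard deviation of the per-bin efficiencies (0 means perfectly flat)."""
    return bin_efficiencies(predictions, mass, threshold, n_bins).std()


# Synthetic data (hypothetical): one set of predictions ignores mass, one follows it.
rng = np.random.default_rng(0)
mass = rng.uniform(0.0, 1.0, size=20000)
flat_pred = rng.uniform(0.0, 1.0, size=20000)
biased_pred = 0.5 * mass + 0.5 * rng.uniform(0.0, 1.0, size=20000)

flat_spread = efficiency_spread(flat_pred, mass, threshold=0.5)
biased_spread = efficiency_spread(biased_pred, mass, threshold=0.5)
print("spread for mass-independent predictions:", flat_spread)
print("spread for mass-dependent predictions:  ", biased_spread)
```

The mass-dependent predictions give a much larger spread, which is the kind of behaviour a uniform classifier is meant to avoid.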

### Getting this to work

To run most of the notebooks, only IPython and some Python libraries are needed.

To run the example notebooks on a machine, you need:

  • IPython
  • Some Python libraries, which can be installed with any Python package manager. pip is recommended; apt-get works too, but the Ubuntu repositories contain quite old versions of the libraries.

The libraries you need are numpy, scipy, pandas, scikit-learn, rootpy and root-numpy, and possibly a few others. They can be installed from the command line:

sudo pip install numpy scipy pandas scikit-learn rootpy root-numpy
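
To see which of these libraries are already available before installing anything, a short check like the following may help. It is only a sketch: the helper `find_missing` and the `REQUIRED` mapping are made up for this example, and the mapping assumes the usual import names (`sklearn` for scikit-learn, `root_numpy` for root-numpy).

```python
import importlib.util

# pip package name -> importable module name (assumed, see the note above).
REQUIRED = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "rootpy": "rootpy",
    "root-numpy": "root_numpy",
}


def find_missing(required=REQUIRED):
    """Return the pip names of packages whose module cannot be located."""
    return sorted(
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


missing = find_missing()
if missing:
    print("Missing packages:", " ".join(missing))
else:
    print("All listed packages were found.")
```

The check only looks the modules up; it does not import them, so it will not catch a broken installation.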

IPython can be installed via pip as well:

sudo pip install ipython

To run IPython, there is a shell script in the IpythonWorkflow/ subfolder.

To work with ROOT files, you need CERN ROOT. Check that it is installed by typing 'root' in the console.
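
The same check can be done from Python, for example at the top of a notebook. This is a small sketch using only the standard library; `has_executable` is a helper name invented here, and finding the `root` executable on the PATH does not guarantee that rootpy or root-numpy can use it.

```python
import shutil


def has_executable(name):
    """Return True if `name` is found on the current PATH."""
    return shutil.which(name) is not None


if has_executable("root"):
    print("ROOT executable found at", shutil.which("root"))
else:
    print("ROOT executable not found on PATH")
```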

### Roadmap

We are going to publish the notebooks on a server to provide easy access from any machine.

Some tests with different decays will be published soon.
