xbbtagger

To use the necessary python packages on lxplus, use conda!

Preprocessing

Log onto a suitable Eddie node

$ ssh -Y <UUN>@eddie3.ecdf.ed.ac.uk
$ qlogin -pe sharedmem 4 -l h_vmem=10G

Clone the xbbtagger code

$ cd <YOUR-WORKING-DIRECTORY>
$ git clone git@github.com:asogaard/xbbtagger.git
$ cd xbbtagger

In the future, you can use the provided script to log onto Eddie:

$ source login.sh --help
$ source login.sh gpu 20gb

Setup the environment

$ # Install conda...
$ conda env create -f Environments/xbbtagger.yml
$ source activate xbbtagger

Go into the Preprocessing folder

$ cd Preprocessing

Copy over the data

$ mkdir -p /exports/eddie/scratch/<UUN>/xbbtagger/preprocessing/{input,output}
$ ln -s /exports/eddie/scratch/<UUN>/xbbtagger/preprocessing/input  input
$ ln -s /exports/eddie/scratch/<UUN>/xbbtagger/preprocessing/output output
$ scp -r <USERNAME>@lxplus.cern.ch:/afs/cern.ch/work/a/asogaard/public/xbbtagger/input/* /exports/eddie/scratch/<UUN>/xbbtagger/preprocessing/input/

Run the preprocessing code

$ python preprocessing.py --masscut --ttbar | tee log_preprocessing.out
$ python reweighting.py             --ttbar | tee log_reweighting_0.out
$ python reweighting.py   --pt-flat --ttbar | tee log_reweighting_1.out
$ python preparing.py               --ttbar | tee log_preparing.out
$ ls -lrt output/

Training

Log onto a suitable Eddie node

$ ssh -Y <UUN>@eddie3.ecdf.ed.ac.uk
$ qlogin -pe gpu 2 -l h_vmem=40G

Setup the environment

$ conda env create -f Environments/xbbtagger-gpu.yml
$ module load cuda
$ source activate xbbtagger-gpu

Go into the Training folder

$ cd ../Training

Run the training code using TensorFlow (GPU should automatically be inferred if available)

$ KERAS_BACKEND=tensorflow python btagging_nn.py --input_file ../Preprocessing/output/prepared_sample_v2.h5 --batch_size=8192

or using Theano on GPU

MKL_THREADING_LAYER=GNU THEANO_FLAGS=device=cuda,floatX=float32 python btagging_nn.py --input ../Preprocessing/output/prepared_sample_v2.h5 --batch_size=8192

If you want to train multiple classifier on individual pT-slices, please read Parameterized Machine Learning for High-Energy Physics by Baldi et al. (2016). If you still want to do it, you can run e.g.

$ KERAS_BACKEND=tensorflow python btagging_nn.py --pt-slice 200 300

which saves the trained model in a unique directory (KerasFiles/*__pT_200_300GeV/) which allows you to easily distinguish different models.

To run everything, from reweighting to training, in one go, you can use the provided script

$ source run.sh

Jupyter notebook on Eddie

Launch a qlogin session, e.g.

$ qlogin -pe gpu 2

or

source login.sh gpu 20g

Once on the qlogin node, run

$ ssh -NR localhost:8882:localhost:8888 login04 &

Run

$ jupyter notebook --no-browser

This will give you an output of text that will look like:

...
[C 11:59:59.468 NotebookApp]

   Copy/paste this URL into your browser when you connect for the first time,
   to login with a token:
       http://localhost:8888/?token=e325bef2289fc2ce991f61a28e36b66c38314f596af30a5f

You can ignore all of it apart from the last line (the URL), which you will need to paste into a web browser at a later step.

Leave that terminal open and in a different terminal window on your local computer run:
```
$ ssh -NL localhost:8888:localhost:8882 <UUN>@login04-ext.ecdf.ed.ac.uk
```
This will prompt you for your eddie password - enter it there.
Open a web browser and paste the URL that was generated in step 3 into your browser. This will start a jupyter notebook in your browser which will contain the contents of the directory you launched it from on eddie.

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
Environments		Environments
Notebooks		Notebooks
Preprocessing		Preprocessing
Training		Training
.gitignore		.gitignore
README.md		README.md
login.sh		login.sh
run.sh		run.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Environments

Environments

Notebooks

Notebooks

Preprocessing

Preprocessing

Training

Training

.gitignore

.gitignore

README.md

README.md

login.sh

login.sh

run.sh

run.sh

Repository files navigation

xbbtagger

Preprocessing

Training

Jupyter notebook on Eddie

About

Releases

Packages

Contributors 2

Languages

asogaard/xbbtagger

Folders and files

Latest commit

History

Repository files navigation

xbbtagger

Preprocessing

Training

Jupyter notebook on Eddie

About

Resources

Stars

Watchers

Forks

Languages