GitHub - BFreund88/WZ_VBS

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
ControlPlots		ControlPlots
Inputs		Inputs
OutputModel		OutputModel
OutputRoot		OutputRoot
logs		logs
.gitignore		.gitignore
Apply_BDT.py		Apply_BDT.py
Apply_NN.py		Apply_NN.py
OPT_VBS_BDT.py		OPT_VBS_BDT.py
OPT_VBS_NN.py		OPT_VBS_NN.py
README		README
common_function.py		common_function.py
config_OPT_BDT.py		config_OPT_BDT.py
config_OPT_NN.py		config_OPT_NN.py
launch_BDT.sh		launch_BDT.sh
launch_NN.sh		launch_NN.sh
setup.sh		setup.sh
submit_BDT.sh		submit_BDT.sh
submit_NN.sh		submit_NN.sh

Repository files navigation

A WZ->lvll VBS resonance selection using neural networks/BDT.
There are two programs for each method:
First for Neural Networks:
- config_OPT_NN.py
Contains all relevant parameters and a list of input variables as well as the list of samples used
as background and as signal.

- OPT_VBS_NN.py:
Trained with signals and backgrounds, the NN are implemented using the Keras package. They need simple
ntuples as inputs containing all the necessary variables like Mjj, Detajj etc. The background is a
combination of the SM WZ QCD and WZ EW processes and the signal are either the combined GM H5 samples with
masses ranging from 200 to 900 GeV or the HVT signal samples ranging from 250 to 1000 GeV.

This is a simple fully connected NN, the principal hyper parameters are the number of layers, number of
neurons per layer, the learning rate and momentum. The input is split into a training and validation set
(70%/30%). After each epoch the accuracy is measured on the validation set and only the best performance
is saved. Each signal mass is assigned a label corresponding to the resonance mass. The background events
have a randomnly assigned label, taken of the same probability distribution as the signals. This should
allow an optimal performance for all resonance masses. All Hyperparameters can be specified (see help).

- Apply_NN.py:
This applies the trained NN selection to a given dataset, for example mc16a datasets. It takes as input
the best performing NN found with OPT_VBS.py and the model that is used (GM or HVT). The list of samples
the NN is applied on is specified in the config file. It then creates new samples in the
output directory OutputRoot which are identical to the input files as well as an added variable
probSignal which is the output of the NN. These ntuples can then be used to select the optimal cut on the
NN output.

And for BDTs:
- OPT_VBS_BDT.py:
The BDTs are implemented with sklearn. The Hyper parameters are: base_estimator forming the bossted ensemble,
the learning rate and the boosting algorithm.

- Apply_BDT.py:
Similar to Apply_NN this program applies the BDT selection to a given dataset. The output variable is
added to the original ntuple and stored in a new root file.