ML-based Reconstruction for HGCal Data. A tracker for work done at Fermilab.
Current:
Experiments with end-to-end reconstruction/segmentation/regression model design, with a simpler dataset for proof of concept.
Dataset: kaggle page
Experiments: one_shot_tests
Archive:
Presentation Slides
Code
Experiment Sheets
An attempt to learn algorithms that group detector signatures from the same particle together and synthesize them into physically meaningful quantities, facilitating the study of LHC particle collisions in the CMS High-Granularity Calorimeter (HGCal).
The High-Granularity Calorimeter records particle hits as particles from the LHC flow, disintegrate, and shower through it. Calorimetry, here the measurement of the energy a particle deposits as it is absorbed in the detector material, is used to record a particle's signature at each hit through the HGCal layers. The energy lost, the timestamp, and the exact spatial coordinates are recorded, detailing the evolution of the showers.
An attempt needs to be made to reduce the computation and memory requirements of the reconstruction algorithm while improving its accuracy and discrimination capability. Little is known about how we would reconstruct (hypothetical, new!) non-Standard-Model particles that leave unusual signatures in the calorimeter, so flagging non-conforming particles/hits as a separate class in the learning algorithm would also be an interesting capability.
Ground-truth labelling: An event consists of a collection of hits over a 25 ns window. Each event is treated as one data sample and can contain up to 20k particle hits.
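As a concrete (hypothetical) picture of one such data sample, the sketch below builds a toy event as a labelled point cloud. The field names, sizes, and toy distributions are assumptions for illustration, not the repository's actual schema.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hits = 1000  # real events can reach ~20k hits per 25 ns window

# One event = one training sample: a labelled point cloud of hits.
event = {
    "pos": rng.normal(size=(n_hits, 3)),                 # (x, y, z) hit coordinates (toy units)
    "energy": rng.exponential(scale=0.05, size=n_hits),  # energy deposited per hit (toy values)
    "time": rng.uniform(0.0, 25.0, size=n_hits),         # timestamp within the 25 ns window
    "particle_id": rng.integers(0, 50, size=n_hits),     # ground-truth label: source particle
}
```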
EdgeNet: ...
Union-find Segregation: ...
Dynamic Reduction Net: ...
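The component descriptions above are left as placeholders; as a generic illustration of the union-find segregation step (not the repository's code), the sketch below groups hits into clusters from edges that an edge classifier such as EdgeNet has marked as "same particle".

```python
def find(parent, i):
    """Find the root of hit i, with path compression."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def union(parent, i, j):
    """Merge the clusters containing hits i and j."""
    ri, rj = find(parent, i), find(parent, j)
    if ri != rj:
        parent[rj] = ri

def cluster_hits(n_hits, kept_edges):
    """Assign a cluster label to each hit from the kept 'same particle' edges."""
    parent = list(range(n_hits))
    for i, j in kept_edges:
        union(parent, i, j)
    return [find(parent, i) for i in range(n_hits)]

# Example: 5 hits, kept edges (0,1), (1,2), (3,4) -> clusters {0,1,2} and {3,4}.
labels = cluster_hits(5, [(0, 1), (1, 2), (3, 4)])
```

Note that this hard grouping step is not differentiable, which is exactly why end-to-end gradient flow (listed under alternate methods below) requires a differentiable replacement.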
Complete retraining of the pipeline:
- Set up Google Cloud VM
- Transition to PyTorch 1.5.1, CUDA 10.2, and the corresponding PyTorch Geometric version
- Understand the existing pipeline
- Break down data preprocessing and compute statistics
- Retrain the entire pipeline: EdgeNet and Dynamic Reduction Net
Optimize Existing Pipeline:
- Hyperparameter optimization for Segmentation (EdgeNet) and Pooling (Dynamic Reduction Net)
- Optimal Neighbours for Graph CNN Accumulation
- Optimal number of Pooling Layers
- Max pool vs. mean pool
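The neighbour count for graph-CNN accumulation is one of the hyperparameters above. As a minimal sketch (assuming a plain Euclidean k-NN over hit positions, analogous to what PyTorch Geometric's `knn_graph` would build), the snippet below constructs the edge index for a toy point cloud, with `k` as the knob to tune.

```python
import numpy as np

def knn_edges(pos, k):
    """Directed k-NN edge index (2, n_hits * k); k is the hyperparameter to tune."""
    # Pairwise Euclidean distances between all hits.
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]  # k nearest neighbours per hit
    src = np.repeat(np.arange(len(pos)), k)
    return np.stack([src, nbrs.ravel()])

# Toy layout: three nearby hits and one far-away hit.
pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
edges = knn_edges(pos, k=2)
```

Larger `k` widens each node's receptive field per message-passing step at quadratic memory cost in dense form, which ties directly into the compute/memory goals stated earlier.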
Explore Alternate Reconstruction Methods:
- End-to-end gradient flow with the current pipeline
- Infuse graph segregation into the deep network so gradients flow from Reduction all the way back to Segmentation
- Energy Regression
- Attention/ transformer based single shot GNN
- Draw from Object Condensation (by Jan K.)
- ASAP - Supervised clustering without a limit on K
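One way to make the segregation step differentiable, sketched below under assumed shapes: replace the hard union-find grouping with a soft assignment matrix from per-hit cluster logits, so pooled per-cluster energies stay differentiable with respect to the segmentation outputs. This is an illustrative toy, not the pipeline's actual design.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
n_hits, n_clusters = 6, 2

hit_energy = rng.exponential(scale=1.0, size=n_hits)  # per-hit deposited energies
logits = rng.normal(size=(n_hits, n_clusters))        # hypothetical segmentation scores

# Soft cluster membership: each row sums to 1, so the pooling is a smooth
# function of the logits and gradients can flow back into the segmentation stage.
S = softmax(logits, axis=1)
cluster_energy = S.T @ hit_energy  # differentiable pooled energy per cluster
```

Because each hit's memberships sum to one, total energy is conserved across the pooling, a sanity check worth keeping in any differentiable replacement for the hard grouping.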
- The Standard Model https://home.cern/science/physics/standard-model
- GNN Papers:
- Geometric Deep Learning: Going beyond Euclidean data https://arxiv.org/abs/1611.08097
- Review paper: https://arxiv.org/abs/1812.08434
- Networks typically used:
- An interesting specialized loss: Object Condensation https://arxiv.org/abs/2002.03605
- A more specific example paper: Graph Neural Networks for Particle Reconstruction in High Energy Physics Detectors https://arxiv.org/abs/2003.11603
- Some interesting directions to go:
- Covariant Compositional Networks For Learning Graphs https://arxiv.org/abs/1801.02144
- ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations https://arxiv.org/abs/1911.07979
- JUNIPR (Dianna's suggestion): Binary JUNIPR achieves state-of-the-art performance for quark/gluon discrimination and top-tagging. https://arxiv.org/pdf/1906.10137.pdf
- Deployment: