
HGCal Reconstruction

ML-based reconstruction for HGCal data. A tracker for work done at Fermilab.

Current:
Experiments with end-to-end reconstruction/segmentation/regression model design, with a simpler dataset for proof of concept.
Dataset: kaggle page
Experiments: one_shot_tests

Archive:
Presentation Slides
Code
Experiment Sheets


Summary

An attempt to learn algorithms that group detector signatures from the same particle together and synthesize them into physically meaningful quantities, facilitating the study of LHC particle collisions in the CMS High-Granularity Calorimeter.

The High-Granularity Calorimeter records particle hits as particles from the LHC flow, disintegrate, and shower through it. Calorimetry, the process of measuring the energy a particle deposits as it traverses the detector, is used to record a particle's signature at each hit through the HGCal layers. The energy lost at each hit, along with a timestamp and precise spatial coordinates, is recorded, detailing the evolution of the showers.
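The per-hit record described above can be sketched as a simple data structure. This is a minimal sketch: the field names and units are illustrative assumptions, not the repository's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One detector hit. Field names/units are assumptions for illustration."""
    x: float       # transverse coordinates of the sensor cell
    y: float
    z: float       # depth along the HGCal layers
    energy: float  # energy deposited in the cell
    time: float    # timestamp of the hit within the event window (ns)


# An event is simply the collection of hits in one 25 ns window
# (up to ~20k hits in real data; two toy hits here).
event = [
    Hit(x=1.2, y=-0.4, z=320.0, energy=0.05, time=0.8),
    Hit(x=1.3, y=-0.5, z=321.2, energy=0.03, time=1.1),
]
total_energy = sum(h.energy for h in event)
```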

An attempt needs to be made to reduce the computation and memory requirements of the reconstruction algorithm while improving its accuracy and discrimination capability. Little is known about how we would reconstruct (hypothetical, new!) non-Standard-Model particles that leave unusual signatures in the calorimeter, so highlighting non-conforming particles/hits as a separate class in the learning algorithm would also be an interesting capability.

Ground-truth labelling: An event consists of a collection of hits over a 25 ns window. Each event is treated as a single data sample and can contain up to 20k particle hits.
EdgeNet: ...
Union-find Segregation: ...
Dynamic Reduction Net: ...
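The union-find segregation step named above (grouping hits connected by edges the segmentation network keeps) can be illustrated with a minimal disjoint-set sketch. The edge list here is a made-up stand-in for EdgeNet output, not the repository's code:

```python
def find(parent, i):
    # Path-halving find: walk to the root, shortening the chain as we go.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, a, b):
    # Merge the sets containing hits a and b.
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


# 6 hits; edges the segmentation network predicted as "same particle"
# (hypothetical values for illustration).
n_hits = 6
predicted_edges = [(0, 1), (1, 2), (3, 4)]
parent = list(range(n_hits))
for a, b in predicted_edges:
    union(parent, a, b)

# Collect hits by their root: each group is one particle candidate.
clusters = {}
for i in range(n_hits):
    clusters.setdefault(find(parent, i), []).append(i)
print(sorted(clusters.values()))
```

Hits 0-2 end up in one cluster, 3-4 in another, and the unconnected hit 5 forms its own singleton cluster.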


Tentative Roadmap

  • Complete retraining of the pipeline

    • Set up Google Cloud VM
      • Transition to PyTorch 1.5.1 and CUDA 10.2 and the corresponding PyTorch Geometric version
    • Understand the existing pipeline
      • Break down data preprocessing and compute statistics
      • Retrain the entire pipeline: EdgeNet and Dynamic Reduction Net
  • Optimize Existing Pipeline

    • Hyperparameter optimization for Segmentation (EdgeNet) and Pooling (Dynamic Reduction Net)
      • Optimal Neighbours for Graph CNN Accumulation
      • Optimal number of Pooling Layers
      • Max pool vs. avg pool
  • Explore Alternate Reconstruction Methods

    • End-to-end gradient flow with the current pipeline
      • Fold graph segregation into the deep network so gradients flow from reduction all the way back to segmentation
    • Energy Regression
      • Attention/transformer-based single-shot GNN
      • Draw from Object Condensation (by Jan K.)
      • ASAP - Supervised clustering without a limit on K
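One knob from the roadmap above, the number of neighbours used for graph CNN accumulation, amounts to choosing k when building a k-nearest-neighbour graph over hits. A brute-force sketch (real pipelines would use torch_cluster's knn_graph or similar; the toy coordinates are made up):

```python
import math


def knn_edges(points, k):
    """Brute-force directed k-nearest-neighbour edge list.

    This is the graph a graph CNN would accumulate messages over;
    O(n^2), so only suitable as an illustration.
    """
    edges = []
    for i, p in enumerate(points):
        # Distance to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Toy 3D hit positions; k is the hyperparameter to tune.
hits = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (5, 5, 5)]
print(knn_edges(hits, k=2))
```

Each of the n hits gets exactly k outgoing edges, so sweeping k directly trades graph size (and hence compute/memory) against how far information can propagate per message-passing step.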

Resources and References

  1. The Standard Model https://home.cern/science/physics/standard-model
  2. GNN Papers:
  3. Deployment:
