pygama

Python-based package for data processing and analysis

installation

Install on local systems with:

$ git clone [url]
$ pip install -e pygama

Installation at NERSC:

$ pip install -e pygama --user

Uninstall: pip uninstall pygama

To run pygama at NERSC (and set up JupyterHub), we have additional instructions at this link.

overview

pygama is a Python package for:

converting physics data acquisition system output to "lh5"-format hdf5 files
performing bulk digital signal processing on time-series data
optimizing those DSP routines and tuning associated analysis parameters
generating and selecting high-level event data for further analysis

The basic steps for a typical analysis are:

Convert DAQ output to "raw" lh5 format using daq_to_raw
Browse data in the lh5 files to verify its integrity
Run raw_to_dsp on the raw files to generate "dsp" lh5 output
    a. Optimize the DSP parameters
    b. Build an analysis parameter database storing the optimized parameters
    c. Re-run with the optimized DSP routines
Run dsp_to_hit (or create your own version) to generate hit files from the dsp data
Run hit_to_evt to generate files with event structures
Concatenate / join / filter evt and raw-dsp-hit data to extract the fields you need for higher-level analysis

These steps are detailed below.

daq_to_raw

The primary function for converting DAQ output into raw lh5 files is daq_to_raw in pygama.io. This is a one-to-many function: one input DAQ file can generate one or more output raw files. Channel groups control which data ends up in which files, and in which hdf5 groups inside each file. If no ch_group is specified, all decoded data should be written to a single output file, with all fields from each hardware decoder in their own output table.

Currently we support only the following hardware:

FlashCams
SIS3302 read out with ORCA
GRETINA digitizer (MJD firmware) read out with ORCA

Partial support is in place, but requires updating, for:

SIS3316 read out with ORCA
SIS3316 read out with llamaDAQ
CAEN DT57XX digitizers read out with CoMPASS

(link to daq_to_raw tutorial)

daq_to_raw to-do's

Time coincidence map generation
Waveform object implementation
fully implement remaining DAQ loops / hardware
add consistency / data integrity checks
documentation
Unit tests
- Generate raw file from a "standard" daq file with expected output to screen
- Check that all expected columns exist, have the right number of rows, and check md5sums of their data
- Add a few additional lh5 fields for lh5 tests
Optimization

lh5 files

Pygama works with "LEGEND HDF5" (lh5) format data. The lh5 specification can be found here. Our python implementation is in pygama.lh5.

lh5 files can be browsed easily in python like any hdf5 file using h5py. (link to lh5 browsing tutorial in lh5 readme... LEGEND collaborators can view J. Detwiler's presentation here)
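
As a minimal sketch of browsing with h5py, the snippet below walks a file and lists every dataset with its shape and dtype. It uses only standard h5py calls. The file name, group names, and field names (demo_raw.lh5, ch0/raw, energy, timestamp) are hypothetical stand-ins created so the example runs without real data; they are not taken from the lh5 specification.

```python
import os
import tempfile

import h5py
import numpy as np


def list_datasets(path):
    """Return {dataset_name: (shape, dtype_string)} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems calls this once for every group and dataset in the file
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


# Build a small stand-in file (hypothetical layout, not a real lh5 file).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_raw.lh5")
with h5py.File(demo_path, "w") as f:
    grp = f.create_group("ch0/raw")
    grp.create_dataset("energy", data=np.arange(5, dtype="uint16"))
    grp.create_dataset("timestamp", data=np.linspace(0.0, 1.0, 5))

datasets = list_datasets(demo_path)
for name, (shape, dtype) in sorted(datasets.items()):
    print(name, shape, dtype)
```

To inspect a real file, pass its path to list_datasets in place of demo_path.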

In addition to the standard h5py interface, pygama provides a WaveformBrowser. (link to waveform browser tutorial... LEGEND collaborators can view I. Guinn's presentation here)

lh5 to-do's

table joins
waveform object definition and compression
fix overwrite
Fletcher32 md5sums
unit tests

raw_to_dsp

DSP is performed by extracting a table of raw data including waveforms and passing it to the ProcessingChain. The primary function for DSP is raw_to_dsp.

DSP is controlled via a json-formatted file that sets up which routines can be run, and which parameters are selected for output. See an example dsp.json file here. The DSP can refer to a dictionary of "database" values (see analysis parameters database below).

Available processors include all numpy ufuncs as well as this list of custom processors.
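
To illustrate what using a numpy ufunc as a processor can look like, here is a sketch that applies ufuncs to a whole block of waveforms at once. It is plain numpy and does not use the ProcessingChain or any pygama API; the waveform block, the 40-sample baseline window, and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical block of 4 waveforms x 100 samples: noisy baseline near 100
# with a step of height 20 starting at sample 50.
waveforms = 100.0 + rng.normal(0.0, 1.0, size=(4, 100))
waveforms[:, 50:] += 20.0

n_baseline = 40  # assumed baseline window length, in samples

# np.add.reduce sums along the sample axis; np.divide turns the sum into a mean.
baseline = np.divide(np.add.reduce(waveforms[:, :n_baseline], axis=1), n_baseline)

# np.subtract broadcasts one baseline value per waveform across all its samples.
blsub = np.subtract(waveforms, baseline[:, np.newaxis])

# np.maximum.reduce gives the largest baseline-subtracted sample per waveform.
amplitude = np.maximum.reduce(blsub, axis=1)

print(blsub.shape, amplitude.shape)
```

Each call operates on every waveform in the block without an explicit Python loop, which is the property that makes ufuncs suitable for bulk processing.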

(link to tutorial on dsp)

raw_to_dsp to-do's

SiPM DSP
Resting baseline and PZ correction in trap filters
Improved vectorization
Additional filters
Unit tests
- process a standard input file and check the output
- one unit test for each processor
Optimization

analysis parameters database

The DSP and other routines can make use of an analysis parameters database, which is a json-formatted file read in as a python dictionary. It can be sent to the DSP routines to load optimal parameters for a given channel.
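
The sketch below shows the general idea of a json-formatted parameter database read into a Python dictionary and queried per channel. The layout (channel names as top-level keys), the parameter names (pz_tau_us, trap_rise_us), and the params_for helper are all hypothetical, chosen for illustration; they are not pygama's actual database schema or API.

```python
import json

# Hypothetical database contents; in practice this text would be read from a file.
db_text = """
{
  "ch0": {"pz_tau_us": 72.0, "trap_rise_us": 8.0},
  "ch1": {"pz_tau_us": 68.5}
}
"""

# Hypothetical fallback values used when a channel has no optimized entry.
defaults = {"pz_tau_us": 70.0, "trap_rise_us": 10.0}


def params_for(db, channel, defaults):
    """Return the defaults overridden by any values stored for this channel."""
    merged = dict(defaults)
    merged.update(db.get(channel, {}))
    return merged


db = json.loads(db_text)
ch1_params = params_for(db, "ch1", defaults)
print(ch1_params)
```

Merging onto a copy of the defaults means a channel only needs to store the parameters that were actually optimized for it.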

(link to example, tutorial, etc... LEGEND collaborators can view J. Detwiler's example here)

dsp optimization

DSP optimization

(LEGEND collaborators can view J. Detwiler's example here)

parameter tuning

TBD

dsp_to_hit

TBD

hit_to_evt

TBD

automation

TBD
