This package is intended to serve two purposes:
- A packaged Python implementation of the knockoffs framework introduced by Dai and Barber 2016 (https://arxiv.org/abs/1602.03589) for variable selection problems with highly correlated features.
- A variety of algorithms for adaptively selecting groupings to maximize power while maintaining FDR control.
This package is in the early stages of heavy development; docs and tests are still to come.
- To run all tests, run
python3 -m pytest
- To run only the tests with a specific label, run
pytest -v -m {label}
- To select all labels except a particular one, run
pytest -v -m "not {label}"
(with the quotes).
- To run a specific file, try pytest test/{file_name}.py. To run a specific test within the file, run pytest test/{file_name}.py::classname::test_method. You don't have to specify the exact test_method: leaving it off runs every test in the class.
- To run a test with profiling, try
python3 -m pytest {path} --profile
This should generate a set of .prof files in prof/. Then you can run snakeviz filename.prof to visualize the output. The command also accepts more flags/options that control the output.
- However, this isn't really recommended: cprofilev is much better. To run cprofilev, copy and paste the test to proftest/* and then run
python3 -m cprofilev proftest/test_name.py
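If neither snakeviz nor cprofilev is at hand, a .prof file can also be inspected with the standard library alone. The sketch below assumes the files in prof/ are ordinary cProfile dumps (the format snakeviz reads). The `slow_sum` workload and the temporary `example.prof` path are placeholders for illustration, not part of this package.

```python
import cProfile
import pstats
import tempfile
from pathlib import Path


def slow_sum(n: int) -> int:
    """Placeholder workload standing in for a test being profiled."""
    return sum(i * i for i in range(n))


# Write a .prof file, as a stand-in for one generated in prof/.
prof_path = Path(tempfile.mkdtemp()) / "example.prof"
profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()
profiler.dump_stats(str(prof_path))

# Load the file and print the ten entries with the largest cumulative time.
stats = pstats.Stats(str(prof_path))
stats.sort_stats("cumulative").print_stats(10)
```

To look at a real run, point `pstats.Stats` at one of the generated .prof files instead of the temporary path.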
- Gaussian knockoff generator should be class-based
- There should be an overarching "sample knockoffs" function where you can put the type of knockoffs you want to sample in as an input argument.
- It would be cool if we moved the KS test code and used it as a method to validate the knockoffs.
- Knockoff Filter + Debiased Lasso
- Need to think about whether we'll actually shift X
- Add hierarchical clustering to ASDP group-making
- Gradient-based method can be sped up
- Add value for rec_prop
- DGP class? instead of returning like 6 things?
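
As a rough illustration of the last item, one way to avoid returning many separate values is to bundle them in a dataclass. This is only a sketch: the `DGPResult` name, its fields, and the toy `sample_data` function are all hypothetical, since the list above does not say which values the current data-generating code returns.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class DGPResult:
    """Hypothetical container for the outputs of a data-generating process."""
    X: List[List[float]]  # design matrix: n rows of p features
    y: List[float]        # response vector of length n
    beta: List[float]     # true coefficients of length p


def sample_data(n: int, p: int, sparsity: float = 0.5, seed: int = 0) -> DGPResult:
    """Toy linear model with independent standard normal features (illustration only)."""
    rng = random.Random(seed)
    # Each coefficient is nonzero with probability `sparsity`.
    beta = [1.0 if rng.random() < sparsity else 0.0 for _ in range(p)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # y = X beta + standard normal noise
    y = [sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, 1.0) for row in X]
    return DGPResult(X=X, y=y, beta=beta)


data = sample_data(n=5, p=3)
print(len(data.X), len(data.y), len(data.beta))
```

Callers then access `data.X` or `data.beta` by name rather than unpacking a long tuple, and new fields can be added later without breaking existing call sites.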