The UCR Clearance Parser powers the NPR Visuals crime clearance lookup tool by processing raw FBI UCR clearance data and generating an agency lookup file and JSON files for each law enforcement agency.
Requires PostgreSQL. See the NPR Visuals guide to set up a Mac development environment with PostgreSQL.
pip install -r requirements.txt
git submodule update --init
./process.sh
This generates approximately 22,000 JSON files of the form <ori7>.json (e.g. NY03030.json) and agency_names.csv in the output directory.
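To spot-check the generated files from Python, something like the following may help. It is a minimal sketch that assumes only what is stated above: the files sit in the output directory and are named by seven-character ORI. The helper names load_agency and count_agency_files are hypothetical and not part of this repository, and the sketch makes no assumption about the keys inside each JSON file.

```python
import json
from pathlib import Path


def load_agency(ori7, output_dir="output"):
    """Load the JSON file for one agency, identified by its 7-character ORI.

    Hypothetical helper for illustration; not part of this repository.
    """
    path = Path(output_dir) / f"{ori7}.json"
    with path.open() as f:
        return json.load(f)


def count_agency_files(output_dir="output"):
    """Count the per-agency JSON files found in the output directory."""
    return sum(1 for _ in Path(output_dir).glob("*.json"))
```

For example, load_agency("NY03030") returns the parsed contents of output/NY03030.json, and count_agency_files() should come out near 22,000 after a full run.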
Requires AWS environment variables to be set.
./deploy.sh
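Before deploying, it can be useful to confirm that credentials are present in the environment. The sketch below assumes the standard AWS credential variable names, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. This README does not say which variables deploy.sh actually reads, so treat the list as an assumption and adjust it as needed. The function name missing_aws_vars is hypothetical.

```python
import os

# Assumption: the standard AWS credential variable names.
# deploy.sh may read different or additional variables.
ASSUMED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None, names=ASSUMED_AWS_VARS):
    """Return the names from `names` that are unset or empty in `environ`."""
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]
```

Calling missing_aws_vars() with no arguments checks the current process environment. An empty list means every assumed variable is set.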
- Crosswalk file converted from Stata to CSV using R from Law Enforcement Agency Identifiers Crosswalk, 2005.
- The JSON writer -- write_clearance_json() in parse.py -- is quite ugly. If you need to extend the JSON output, consider refactoring this function. Pull requests encouraged!
- The parse() function in parse.py is a handy, fast parser for raw FBI UCR clearance data files, known as "master" files.
- data/UCR52406-2013.txt is the FBI master agency list as exported from the UCR system. It was not used in our final product, but might be useful.
- The scripts and processed output contain median clearance rates based on population bucket. These medians are technically correct but not necessarily reliable, because the many zeroes in the clearance data bias them. The zeroes are ambiguous: they could mean that the agency did not report, that its data was rejected by the FBI, or that it did not clear any cases. Use with care.
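To see how zeroes can pull a median down, consider the small illustration below. The rates are made up for the example and do not come from the FBI data. It is only a sketch of the effect, not the calculation used by the scripts in this repository.

```python
from statistics import median

# Made-up clearance rates (percent) for agencies in one population bucket.
# Illustrative only: these are not values from the FBI data.
rates = [0, 0, 0, 0, 35, 40, 45, 50, 55]

median_all = median(rates)  # zeroes included
median_nonzero = median([r for r in rates if r > 0])  # zeroes excluded

print(median_all, median_nonzero)  # prints: 35 45
```

Neither number is clearly the right one. Dropping the zeroes removes non-reporting agencies, but it also discards any agency that truly cleared no cases, which is why the medians should be used with care.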
MIT licensed, see LICENSE for details.