This repo contains scraper code for maintaining a complete copy of all data on Regulations.gov (mainly Federal Register documents and public comments), extracting text from those documents, and running named entity recognition (using Oxtail) and plagiarism detection/clustering (using cluster-explorer). The project also includes scrapers for two non-participating agencies, the SEC and CFTC, and shoehorns their content into the Regulations.gov data model.
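To give a rough sense of what near-duplicate clustering of comment text can look like, here is a minimal sketch that compares comments by word shingles and Jaccard similarity, then groups them with union-find. This is an illustration only: it is not cluster-explorer's algorithm or API, and the names `shingles`, `jaccard`, `cluster_comments`, and the `threshold` and `k` parameters are hypothetical.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle; empty texts have none.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_comments(comments, threshold=0.5, k=3):
    """Group comment indices whose shingle sets are at least `threshold` similar.

    Hypothetical helper: single-link grouping via union-find, O(n^2) pairs,
    so only suitable for small batches.
    """
    sets = [shingles(c, k) for c in comments]
    parent = list(range(len(comments)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(comments)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(comments)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    sample = [
        "I strongly oppose this proposed rule because it harms small farms",
        "I strongly oppose this proposed rule because it harms small businesses",
        "Please extend the comment period by thirty days",
    ]
    print(cluster_comments(sample))
```

The pairwise comparison is quadratic in the number of comments, so a corpus the size of Regulations.gov would need an approximate method (for example, MinHash with locality-sensitive hashing) rather than this brute-force version.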
Scraper of public comments on regulations.gov
sunlightlabs/regulations-scraper