Scrapes the search result pages of https://zoek.officielebekendmakingen.nl/ using Scrapy, and downloads the XML documents it finds along the way.

justinvw/officiele-bekendmakingen-scraper

Officiële bekendmakingen scraper

Author: Justin van Wees (justin@vwees.net)
Date: 2010-06-21

Officiële bekendmakingen scraper scrapes the search result pages of https://zoek.officielebekendmakingen.nl/ and downloads the XML documents it finds along the way.
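As a rough illustration of the idea (not the project's actual Scrapy spider), the sketch below collects links to XML documents from a search result page using only the Python standard library. The class name XmlLinkCollector, the helper find_xml_links, the sample HTML fragment, and the assumption that document links can be recognised by a ".xml" path suffix are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class XmlLinkCollector(HTMLParser):
    """Collects absolute URLs of <a href> targets whose path ends in .xml."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Assumption: XML documents are recognisable by a .xml path suffix.
        if urlparse(absolute).path.lower().endswith(".xml"):
            self.links.append(absolute)


def find_xml_links(html, base_url):
    """Return the XML document URLs found in an HTML page, in page order."""
    parser = XmlLinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical search result fragment, for illustration only.
SAMPLE_HTML = """
<ul>
  <li><a href="/kst-12345-1.xml">Document 1 (XML)</a></li>
  <li><a href="/kst-12345-1.html">Document 1 (HTML)</a></li>
  <li><a href="kst-67890-2.xml?x=1">Document 2 (XML)</a></li>
</ul>
"""

if __name__ == "__main__":
    base = "https://zoek.officielebekendmakingen.nl/"
    for url in find_xml_links(SAMPLE_HTML, base):
        print(url)
```

In the real project, Scrapy handles the fetching, link following, and downloading; this sketch only shows the link-filtering step in isolation.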

Requirements

Installation and configuration

After making sure that all the required Python packages are installed, edit "officielebekendmakingen/settings.py". The settings should be self-explanatory.

Running the Officiële bekendmakingen scraper

Run python scrapy-ctl.py crawl zoek.officielebekendmakingen.nl

You can monitor the Scrapy process by visiting http://[HOSTNAME]:6080 or by opening a Telnet session to port 6023. In the Telnet session, the "stats" object contains information about the current run.
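Before opening a browser or Telnet session, it can help to confirm that the two monitoring ports are reachable. The following is a minimal sketch using only the Python standard library; the helper name is_port_open and the "localhost" hostname are assumptions for illustration, while the port numbers are the ones mentioned above.

```python
import socket

# Ports taken from the text above.
WEB_PORT = 6080
TELNET_PORT = 6023


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host = "localhost"  # assumption: replace with the machine running the scraper
    for name, port in (("web console", WEB_PORT), ("Telnet console", TELNET_PORT)):
        state = "reachable" if is_port_open(host, port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

This only checks that something accepts TCP connections on those ports; it does not verify that the listener is the Scrapy process.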
