The Newsapps Boundary Service

The Boundary Service is a ready-to-deploy system for aggregating regional boundary data (from shapefiles) and republishing that data via a RESTful JSON API. It includes an extensive demo site that exercises the API from client-side JavaScript.

This project is aimed at providing a simple service for newsrooms, open-government hackers and others to centralize and build on regional GIS data. You can see the instance we’ve configured for Chicago & Illinois, along with much more detailed information about the API at http://boundaries.tribapps.com/.

Local setup

To bootstrap the boundary service locally you’ll first need to create a virtualenv and install the requirements. Using virtualenvwrapper that looks like this:

cd boundaryservice
mkvirtualenv boundaryservice
pip install -r requirements.txt

You’ll need to have Postgres running locally. You can then fast-track database setup by running:

fab local_bootstrap

Finally, you will need to load some data. The project comes with one example dataset. To load all available datasets you should run:

./manage load_shapefiles

In the future, you can load only specific shapefiles by passing the -o flag with a comma-delimited list of Boundary Set names (whitespace within each name is collapsed). You can also reload boundary sets, clearing their existing data, by passing the -c flag. Combine the two flags to clear and reload only specific datasets.
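As a rough illustration of how names map to the value passed to -o, the sketch below strips all whitespace from each name and joins the results with commas. The helper names and the example Boundary Set names are hypothetical, and the sketch assumes "collapsed" means whitespace is removed entirely; the loader's own implementation may differ.

```python
def collapse_boundary_set_name(name):
    """Remove all whitespace from a Boundary Set name (assumed behavior)."""
    return "".join(name.split())


def only_flag_value(names):
    """Build a comma-delimited value suitable for the -o flag."""
    return ",".join(collapse_boundary_set_name(n) for n in names)


# Hypothetical Boundary Set names, used only for illustration.
print(only_flag_value(["Community Areas", "Police Beats"]))
```

The printed value could then be passed as ./manage load_shapefiles -o followed by that string.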

To run the service locally you will need to run two commands in separate terminals: one to run the API and another to run the demo application.

./runapi

./manage runserver

Once those commands are running you should be able to visit http://localhost:8000/ to see the demo site, which will be making AJAX requests to http://localhost:8001/1.0/ for boundary data. To see the API in action, try a URL such as http://localhost:8001/1.0/boundary-set/.
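The same endpoint can also be queried from a script. The following is a minimal sketch using only the Python 3 standard library; the helper names api_url and fetch_json are hypothetical, and nothing is assumed about the JSON structure beyond it being valid JSON.

```python
import json
from urllib.request import urlopen

# Local API address taken from the text above.
API_ROOT = "http://localhost:8001/1.0/"


def api_url(resource, root=API_ROOT):
    """Join the API root and a resource name into a URL with a trailing slash."""
    return root.rstrip("/") + "/" + resource.strip("/") + "/"


def fetch_json(url, timeout=10):
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no running server.
print(api_url("boundary-set"))
```

With ./runapi running, calling fetch_json(api_url("boundary-set")) should return the decoded response for the boundary-set listing.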

Remote setup (deployment)

The Boundary Service is based on the most recent iteration of our newsapps Django project layout (which is now part of Gareth Rushgrove’s django-project-templates). Using this base for a project provides extensive infrastructure for doing robust deployments, including isolated configuration, stable branch handling, deployment of static assets to S3, and an extensive fabric deployment script.

Thus, there are three options for deployment:

Our project template is tightly integrated with our GeoDjango Amazon EC2 image, so deploying to that image is the quickest way to get up and running.

Alternatively, you should be able to deploy the site to any architecture by modifying the appropriate paths in the configuration files found in boundaryservice/configs and in fabfile.py.

If neither of those ideas appeals to you, it should be reasonably straightforward to simply remove the “api” app from this project and migrate it into your own project template. (Note: this will make us cry.)

Adding data

To add data to the Boundary Service you will first need to add a shapefile and its related files (prj, dbf, etc.) to the data/shapefiles directory. See data/shapefiles/neighborhoods for an example shapefile.

Once your data is in place, you will modify data/shapefiles/definitions.py to add a declaration for your new shapefile to the SHAPEFILES dictionary. The Chicago neighborhoods example includes extensive commenting describing the various fields and how they should be populated. Note that the Boundary Service will normally be able to infer the projection of your shapefile and automatically transform it to an appropriate internal representation.

Of particular note amongst the fields are the ‘ider’ and ‘namer’ properties. These should be assigned to functions which will be passed a feature’s attributes as a dictionary. ‘ider’ should return a unique external id for the feature (e.g. a district id number, geographic id code or any sequential primary key). Whenever possible these ids should be stable across revisions to the dataset. ‘namer’ should return a canonical name for the feature, not including its kind (e.g. “Austin” for the Austin Community Area, “Chicago” for the City of Chicago, or “#42” for Police Beat #42). A number of callable classes are defined in data/shapefiles/utils.py, which should mitigate the need to write custom functions for each dataset.
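To make the ‘ider’ and ‘namer’ contract concrete, here is a minimal sketch of one plain function and one callable class, each taking a feature’s attribute dictionary. The attribute names (DIST_NUM, NAME), the FieldNamer class and the partial entry are hypothetical illustrations; they are not the classes shipped in data/shapefiles/utils.py, and returning the id as a string is an assumption. A real entry needs the remaining fields documented in the Chicago neighborhoods example.

```python
def district_ider(feature):
    """Return a stable external id from the feature's attribute dictionary.

    Assumes a hypothetical DIST_NUM attribute; returns it as a string.
    """
    return str(feature["DIST_NUM"])


class FieldNamer(object):
    """Callable returning one attribute as the feature's canonical name.

    Illustrative only; not one of the classes in data/shapefiles/utils.py.
    """

    def __init__(self, field, prefix=""):
        self.field = field
        self.prefix = prefix

    def __call__(self, feature):
        # Strip stray padding that attribute tables sometimes carry.
        return self.prefix + str(feature[self.field]).strip()


# Partial entry showing only the fields named in the text above.
EXAMPLE_ENTRY = {
    "ider": district_ider,
    "namer": FieldNamer("DIST_NUM", prefix="#"),
    "notes": "",
}

# Hypothetical feature attributes.
feature = {"DIST_NUM": 42, "NAME": "  Austin "}
print(EXAMPLE_ENTRY["ider"](feature))
print(EXAMPLE_ENTRY["namer"](feature))
print(FieldNamer("NAME")(feature))
```

A callable class is convenient here because the same logic can be reused across datasets by changing only the attribute name.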

Once definitions.py has been saved the new shapefile can be loaded by running:

./manage load_shapefiles -o BoundaryKindWithoutWhitespace

The “-c” parameter can also be passed to clear existing boundaries of only the specified type and then load the data. Multiple boundaries can be cleared and loaded by passing a comma-separated list to “-o”.

As a matter of best practice, shapefiles acquired from government entities and other primary sources should not be modified before they are loaded into the Boundary Service. (This is why the Chicago neighborhoods shapefile keeps its misspelling, “Neighboorhoods”.) If it is necessary to modify the data, this should be noted in the ‘notes’ field of the shapefile’s definitions.py entry.

Credits

The Boundary Service is a product of the News Applications team at the Chicago Tribune. Core development was done by Christopher Groskopf and Ryan Nagle.

License

MIT.
