cds-amal/addressparser

 
 

NYC Five Boroughs Address parser

The goal of this project is to be able to identify, parse and geo-encode New York City postal addresses from a plaintext source.

Dependencies

  • Python 2.7.6
  • The packages listed in requirements.txt
  • Register your application with the NYC Developer Portal and enable access to the Geoclient API for it. Note the application's ID and key. The ID and key will not work until DoITT approves the application, which can take several days; you will receive an email when it is approved. The dashboard does not show your approval status, and until approval every request returns a 403.

Local deployment (on UNIX/Mac)

  • pip install -r requirements.txt
  • Set the DOITT environment variables. One way is to create a file and source it:
cat <<EOF > DOITT_ENV
export DOITT_CROL_APP_ID=Your_App_ID
export DOITT_CROL_APP_KEY=Your_App_KEY
EOF

source DOITT_ENV
  • Run the Server
python webserver.py
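
The environment variables set above can be read from Python roughly as follows. This is a minimal sketch rather than the code webserver.py actually uses: the function name `load_doitt_credentials` and the fail-fast behavior are assumptions made for illustration. Only the two variable names come from the steps above.

```python
import os

# Variable names taken from the DOITT_ENV file described above.
REQUIRED_VARS = ("DOITT_CROL_APP_ID", "DOITT_CROL_APP_KEY")


def load_doitt_credentials(environ=None):
    """Hypothetical helper: return (app_id, app_key) or raise if either is unset."""
    environ = os.environ if environ is None else environ
    # Treat empty strings the same as missing variables.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return environ["DOITT_CROL_APP_ID"], environ["DOITT_CROL_APP_KEY"]
```

Checking for the variables at startup gives a clear error when DOITT_ENV was not sourced, which is easier to tell apart from the 403 responses returned while an application is still awaiting approval.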

Testing

This project uses the pytest framework for its tests.

# run all tests
py.test -v

# mark a test as wip (work in progress); the marker name is arbitrary
@pytest.mark.wip

# run tests decorated as wip
py.test -m wip

# decorate a test to be skipped
@pytest.mark.skipif("True")

# test an ad-hoc address from the commandline
python nyctext/adparse.py "Johnson Doe: 1802  OCEAN PARKWAY  BKLYN, NY"

# trace the parsing journey of the same ad-hoc address
python nyctext/adparse.py --trace "Johnson Doe: 1802  OCEAN PARKWAY  BKLYN, NY"
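
To show where the two decorators above go, here is a small illustrative test module. It is a sketch only: `normalize_borough` and its lookup table are hypothetical stand-ins and are not part of this project's code. Newer pytest releases may also warn about unregistered custom markers such as `wip` unless they are listed under `markers` in the pytest configuration.

```python
import pytest

# Hypothetical lookup table, for illustration only; not taken from this project.
BOROUGH_ABBREVIATIONS = {"BKLYN": "Brooklyn", "BX": "Bronx", "SI": "Staten Island"}


def normalize_borough(name):
    """Hypothetical helper: expand a known abbreviation, else title-case the input."""
    key = name.strip().upper()
    return BOROUGH_ABBREVIATIONS.get(key, name.strip().title())


@pytest.mark.wip  # selected by: py.test -m wip
def test_expands_brooklyn_abbreviation():
    assert normalize_borough("BKLYN") == "Brooklyn"


@pytest.mark.skipif("True")  # the string condition is always true, so this test is skipped
def test_full_name_is_title_cased():
    assert normalize_borough("queens") == "Queens"
```

Running `py.test -m wip` against this file would collect only the first test, while a plain `py.test -v` would run the first and report the second as skipped.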

Team

Note on Patches/Pull Requests

  • Fork the project.
  • Make your feature addition or bug fix.
  • Add tests to verify your code.
  • Pass your tests and old tests.
  • Send a pull request. Bonus points for topic branches.

Thanks

License

Apache License, Version 2.0
