Skip to content

cmoresid/yeg-traffic-scrape

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

edmonton-traffic-db

Tool for importing Edmonton traffic volume data into a relational database: http://www.edmonton.ca/transportation/traffic_reports/traffic-volumes-turning-movements.aspx

####Prerequisites Setup You will need to setup some sort of relational database with geospatial capabilities in order to save the results of the extraction process. My suggestion is to use PostGIS which is a geospatial extension to a Postgres database. For Mac OS X, you can use Postgres.app. It includes the PostGIS extensions and the Postgres database.

Once you have your database installed and setup, be sure to update the connection string in the config.py file. See SQLAlchemy Connection String Format for the connection string format.

We make use of several Python modules, all of which can be installed using pip install -U <module_name>. Here are the required Python modules:

  • Crypto
  • geoalchemy2
  • sqlalchemy
  • geojson
  • google-api-python-client
  • httplib2
  • pyOpenSSL

####Prerequisites Setup for Automatic Spreadsheet Download from Google Drive

#####Google Dev Account Setup In order to utilize the automatic spreadsheet download functionality, you will first need to setup a Google Dev Account that has the Google Drive API enabled on it. This step is NOT essential. You can always manually download all the spreadsheets and have the tool process them afterwards.

  1. You will first need to create a project. Follow this link here to create new project.
  2. Under the APIs & auth section, select 'Credentials'.
  3. Select 'Create new Client ID'
  4. Now select 'Installed Application' and select 'Other'.

Note the client ID and client secret. We will be needing them later.

#####Google Drive Setup You will also need add the shared folders containing the traffic event spreadsheets to your Google Drive account. Apparently if you don't add the shared folder to your own account, the shared files will not show up when querying the folder with the API. Go figure...

  1. Open the Traffic Volume Data link.
  2. Select one of the shared folder links (i.e. 001-010, 011-020, etc.
  3. Login to the same account that you used to setup your Google Developer credentials.
  4. Click the 'Add to Drive' button in the top right hand corner.

Repeat this with all the folders.

###How to Download All Traffic Volume Spreadsheets

  1. Complete the prerequisites for automatic download from Google Drive (see above instructions).

  2. Configure you CLIENT_ID and CLIENT_SECRET environmental variables.
    a. In a bash shell, type export CLIENT_ID=<your client ID> in a command prompt to configure your CLIENT_ID. b. In a bash shell, type export CLIENT_SECRET=<your client secret> in a command prompt to configure your CLIENT_SECRET. 3. In the same bash shell you used to configure your environmental variables, type python main.py --download-all --output-dir=<output directory> where <output directory> is the path to the folder where you want to download the files.

    ###How to Import Traffic Volume Spreadsheets From Directory

    1. Download some or all of the Traffic Volume Spreadsheets either automatically or manually from the City of Edmonton's website and place them into a directory. (The 'data' folder for example that is located in the source.)
    2. In a bash shell, type python main.py --import-file=data.

    ####Caveats There are some spreadsheets that will cause problems with the importing process. Some of the spreadsheets have date cells that are mislabeled, which will clash with traffic events in other spreadsheets. These cases will be logged in a file named error_log.txt. You will have to perform manual intervention in order to remedy the problem.

    ###To-Do

    1. Match site locations with co-ordinates.

About

Scrape YEG Traffic Spreadsheets on Google Drive

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages