Website: http://www2.county.allegheny.pa.us/RealEstate/Search.aspx
- assessment/assessments.py - Pulls the data from the county site and Zillow to be analyzed
- assessments/ml - Multiple scripts to analyze the data
pip install -r requirements.txt
python setup.py install
- For more details on distutils: https://docs.python.org/3/distutils/introduction.html
Optional - Sign up for a Zillow Token: https://www.zillow.com/howto/api/APIOverview.htm
Create a text file, such as assessments.txt, listing the housing IDs you want to pull from the site. Only the ID is necessary.
0000-S-00000-0000-01 0000-S-00000-0000-02
- Alternative: Pass a list of comma-separated IDs: 0000-S-00000-0000-01,0000-S-00000-0000-02,...
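As a rough illustration, the two input forms above (a text file of IDs or a comma-separated list) could be normalized along these lines. This is only a sketch: the helper name `load_parcel_ids` is hypothetical and not part of this package, and it assumes IDs in the file are separated by spaces or newlines.

```python
import os


def load_parcel_ids(value):
    """Return parcel IDs from a file path or a comma-separated string.

    Hypothetical helper for illustration; not part of assessment/assessments.py.
    """
    if os.path.isfile(value):
        # Assumption: IDs in the file are separated by spaces or newlines.
        with open(value) as handle:
            tokens = handle.read().split()
    else:
        # Otherwise treat the argument as a comma-separated list of IDs.
        tokens = value.split(",")
    # Strip surrounding whitespace and drop empty entries.
    return [token.strip() for token in tokens if token.strip()]
```

Either `load_parcel_ids("assessments.txt")` or `load_parcel_ids("0000-S-00000-0000-01,0000-S-00000-0000-02")` would then yield a plain list of ID strings.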
Extract the data with assessment/assessments.py:
python -m assessment.assessments --parcels assessments.txt --zwid <Zillow Token>
Notes about this module:
Caches the website in ./data/:
- Limits the number of times you hit the website
- TODO: Cache data in a database (probably DynamoDB)
Writes the output to assessments.csv:
- The current output is a CSV
- TODO: Create a data store (again, probably DynamoDB)
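The caching and CSV behavior described in these notes might look roughly like the sketch below. It is not the module's actual code: `fetch_cached`, `write_csv`, the `fetch` callable, and the one-file-per-ID naming under ./data/ are all assumptions made for illustration.

```python
import csv
import os


def fetch_cached(parcel_id, fetch, cache_dir="./data"):
    """Return the page for parcel_id, reusing a copy on disk when present.

    `fetch` is a hypothetical callable that downloads the page text for one ID.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, parcel_id + ".html")  # assumed file naming
    if os.path.exists(path):
        # Cache hit: read from disk instead of hitting the website again.
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    # Cache miss: download once and store the result for later runs.
    text = fetch(parcel_id)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


def write_csv(rows, path="assessments.csv"):
    """Write a list of dicts to a CSV file, one row per parcel."""
    if not rows:
        return
    # Use the union of keys, in first-seen order, as the header row.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Keeping the cache as plain files makes it easy to inspect or delete a single parcel's page, and the same two functions could later be pointed at a database without changing their callers.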
Analyze the data using sklearn. Sample scripts are in assessments/ml.