The first pass at this project was a crude Python script that looped over the CraigslistHousing generator, converted the results to a pandas DataFrame, and then sent out an email. I ran this program manually on a daily-ish basis from 2020-01-15 until 2020-03-20, prior to my search for a new apartment.
The second iteration of the project deploys a reworked version to Airflow running on a Raspberry Pi (in progress).
.
├── README.md
├── agg_analysis
│   ├── agg_csvs_script.py
│   └── myconfigs.py
├── dag_clist.py
├── images
├── src
│   ├── main_clist.log
│   ├── main_clist.py
│   ├── main_email.py
│   ├── module_clist
│   │   ├── __init__.py
│   │   └── collect_clist.py
│   ├── module_email
│   │   ├── README.md
│   │   ├── __init__.py
│   │   └── email_configs.py
│   └── module_utils
│       ├── __init__.py
│       ├── function_logger.py
│       └── s3_funks.py
└── sys_setup
    ├── pi_files_example.sh
    └── scp_zip_example.sh
Analysis included the images sent out in every email along with a Tableau dashboard.
An interesting observation is that about 80% of posts are automatically re-posted every day for visibility. Filtering out these reposts as noise allowed me to identify the more desirable apartments.
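The repost filtering described above could look roughly like the sketch below. It is not the code used in this repo: the function name `drop_reposts`, the choice of repost key (title, price, neighborhood), and the field names `name`, `price`, `where`, and `datetime` are all assumptions made for illustration, and the sample rows are invented.

```python
from typing import Dict, Iterable, List


def drop_reposts(listings: Iterable[Dict]) -> List[Dict]:
    """Keep only the earliest sighting of each listing (hypothetical helper)."""
    seen = set()
    unique = []
    # Sort by timestamp so the first sighting of a post is the one kept.
    for post in sorted(listings, key=lambda p: p["datetime"]):
        # Assumed repost key: same title, price, and neighborhood.
        key = (post["name"].strip().lower(), post["price"], post["where"])
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


# Invented sample rows, for illustration only.
sample = [
    {"name": "Sunny 1BR", "price": "$2,000", "where": "Downtown", "datetime": "2020-02-01 09:00"},
    {"name": "Sunny 1BR ", "price": "$2,000", "where": "Downtown", "datetime": "2020-02-02 09:00"},
    {"name": "Quiet studio", "price": "$1,500", "where": "Riverside", "datetime": "2020-02-02 10:00"},
]
fresh = drop_reposts(sample)
```

Here the second "Sunny 1BR" row is treated as a repost of the first and dropped. A stricter or looser key would change how much gets filtered out.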
Given my search parameters and a filter for the 15 most common neighborhood names, listings appear to have dropped in average/median price by about 50% and increased in quantity by about 10x!
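A minimal sketch of how such a comparison might be computed for a single snapshot: restrict listings to the most common neighborhood names, then take the count, mean, and median price. The function name `price_summary`, the field names `where` and `price`, and the assumption that prices are already parsed to numbers are all hypothetical, and the toy rows are invented rather than taken from the collected data.

```python
from collections import Counter
from statistics import mean, median


def price_summary(listings, top_n=15):
    """Count, mean, and median price over the top_n most common neighborhoods."""
    counts = Counter(post["where"] for post in listings)
    top = {name for name, _ in counts.most_common(top_n)}
    prices = [post["price"] for post in listings if post["where"] in top]
    return {"count": len(prices), "mean": mean(prices), "median": median(prices)}


# Invented toy rows; prices are assumed to be numeric already.
toy = [
    {"where": "Downtown", "price": 2000},
    {"where": "Downtown", "price": 2400},
    {"where": "Downtown", "price": 2200},
    {"where": "Riverside", "price": 1800},
    {"where": "Riverside", "price": 1600},
    {"where": "Hilltop", "price": 9000},
]
summary = price_summary(toy, top_n=2)
```

Running the same summary on two snapshots (for example one per year) and comparing the `count` and `median` values would give the kind of change described above.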
Tableau Public 2020 Tableau Public 2021
- obligatory blog post of project
- add CI (Travis or Jenkins)
- aggregate analysis/report
- conditional dag for daily and aggregate (every 2 weeks) reports
- move py configs to cfg file at the repo level
- Airflow DAG to automate data collection
- add logging in place of print statements
- save snapshot tables as parquet/csv to S3
- parameterize craigslist search
- develop report into more formal pdf (incorporate Tableau dashboard if possible)
- update Tableau public dashboard (last done 2021-01-18)
- convert dashboard over to Superset
- deploy Airflow DAG to Raspberry Pi
- make sure send_email method can handle w/ and w/o image attachments
- simple UI for new searches and implementation
- manage recipient list via Google Group
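For the `send_email` item above, one possible approach is to build the message with the standard library's `email.message.EmailMessage` and treat image attachments as optional. This is only a sketch: `build_email` and its parameters are hypothetical names, not the repo's actual `send_email` method, and the actual sending step (for example via `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_email(subject, sender, recipients, body, images=None):
    """Build a message that works with or without PNG attachments (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(body)
    # images is an optional iterable of (filename, png_bytes) pairs.
    for filename, data in images or []:
        msg.add_attachment(data, maintype="image", subtype="png", filename=filename)
    return msg


# Placeholder addresses and bytes, for illustration only.
plain = build_email("Daily listings", "me@example.com", ["you@example.com"], "No charts today.")
with_chart = build_email(
    "Daily listings",
    "me@example.com",
    ["you@example.com"],
    "Charts attached.",
    images=[("prices.png", b"\x89PNG\r\n\x1a\n")],
)
```

Defaulting `images` to `None` keeps a single code path for both cases, so a report with no charts does not need a separate function.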