dantonnoriega/zillow-projects
Read Me for Zillow Project Codes

(0) initialize_zillow.do
* This .do file does the following:
* - makes a master list of property IDs with addresses etc.
* - stacks all the property attribute data and then merges in the master property list

(0.1) zillow_property_list.do
* this program is used to make a unique property list
* this is done by doing the following...
* (1) removing duplicate properties
* (2) removing any spaces and extra symbols in "housenumber" and "streetsuffix"
* (3) isolating problematic properties. often there are non-numeric characters in the numeric fields.
* (4) broadening the criteria and refining again

(1) zillow_trimdown.do
* this code trims the zillow data down to usable observations
* (1) merge data with analysis table, then remove unneeded variables.
* (2) split data into numeric and string values
* (3) take the numeric data set and use it to tag unlikely single-family homes (SFH)

(2) zillow_76_to_text.do
* this code does a quick clean of atype 76
* (1) keep only atype 76. replace all non-alphanumeric characters with spaces.
* (2) export text file

(3) run_zillow_ngrams.py
* simple program that calls "zillow_ngrams.py"
* depends on:
* - zillow_ngrams.py
*     main workhorse program. tokenizes the text file created by zillow_76_to_text.do into unigrams, bigrams, and trigrams
* - replace_spanish.py
*     replaces all spanish characters with english versions
* - remove_html.py
*     removes html formatting
* - nolla_lang_detect.py
*     inspired by a program written by a dude last named "Nolla". it scans text and detects whether it is english, spanish, or bilingual.

(4) zillow_logit.do
* imports data, trims counts, then runs a logit on some bigrams
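
As a rough illustration of the cleaning and tokenizing steps described for zillow_76_to_text.do and zillow_ngrams.py, here is a minimal Python sketch. It is not the code in this repository: the function names (clean_text, ngrams, count_ngrams) and the sample sentence are hypothetical, and lowercasing and collapsing runs of symbols into a single space are assumptions made for the example.

```python
import re
from collections import Counter


def clean_text(text):
    """Replace runs of non-alphanumeric characters with one space.

    Lowercasing is an assumption for this sketch, not something the
    README states.
    """
    return re.sub(r"[^A-Za-z0-9]+", " ", text).lower().strip()


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(text, max_n=3):
    """Count unigrams, bigrams, and trigrams in one piece of text."""
    tokens = clean_text(text).split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical listing text, for illustration only.
counts = count_ngrams("Granite counters, granite counters & new roof!")
print(counts["granite counters"])
print(counts.most_common(3))
```

Counts like these, aggregated per property, are the kind of input that a later step such as zillow_logit.do could trim and feed into a model.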
About
projects using zillow data