A Python 3 crawler for Redfin.

wangff/RedfinCrawler

Redfin Crawler

A scraper for Redfin.com, written in Python 3, that collects real estate information on recently sold homes.

Methodology

Avoiding the blacklist

Three methods are used to avoid getting blacklisted while scraping:

  1. Random sleep between requests
  2. Random User-Agent header
  3. Rotating through different proxies
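
The three methods above can be combined in a single request wrapper. The following is a minimal sketch, not the repository's actual code: the `USER_AGENTS` and `PROXIES` pools, the delay bounds, and the function names are all hypothetical, and the HTTP call is passed in as `fetch` so the sketch does not assume a particular HTTP library.

```python
import random
import time

# Hypothetical pools; the repository's actual values are not shown in this README.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]
PROXIES = ["http://127.0.0.1:8080", "http://127.0.0.1:8081"]  # placeholder addresses


def random_delay(min_s=1.0, max_s=5.0):
    """Pick a random pause length in seconds (bounds are assumptions)."""
    return random.uniform(min_s, max_s)


def build_request_settings():
    """Choose a random User-Agent and a random proxy for one request."""
    proxy = random.choice(PROXIES)
    return {
        "headers": {"User-Agent": random.choice(USER_AGENTS)},
        "proxies": {"http": proxy, "https": proxy},
    }


def polite_fetch(url, fetch, sleep=time.sleep):
    """Sleep for a random interval, then call `fetch` with randomized settings."""
    sleep(random_delay())
    return fetch(url, **build_request_settings())
```

If the `requests` library is used (an assumption), `polite_fetch(url, requests.get)` would work as written, since `requests.get` accepts `headers` and `proxies` keyword arguments.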

Parser tools

  1. RedfinDownloadParser: parses the CSV download offered on the page. If the download exists, it is fetched and parsed; otherwise the parser reports failure.
  2. RedfinPageParser: if the CSV isn't provided, the HTML page is parsed instead. This parser should handle results that span multiple pages.
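
The fallback between the two parsers can be sketched as follows. This is an illustration under assumptions, not the classes' real interface: `parse_csv_download`, `parse_listings`, and the `parse_html_page` callable are hypothetical names, and the CSV text and HTML pages are assumed to have been fetched already.

```python
import csv
import io


def parse_csv_download(csv_text):
    """Parse CSV text into a list of row dicts; return None if there is no CSV."""
    if not csv_text:
        return None
    return list(csv.DictReader(io.StringIO(csv_text)))


def parse_listings(csv_text, html_pages, parse_html_page):
    """Prefer the CSV download; otherwise parse every HTML results page.

    `parse_html_page` is a hypothetical callable that turns the HTML of one
    results page into a list of row dicts.
    """
    rows = parse_csv_download(csv_text)
    if rows is not None:
        return rows
    rows = []
    for page in html_pages:  # handle results spread over multiple pages
        rows.extend(parse_html_page(page))
    return rows
```

Returning `None` for a missing download keeps "no CSV available" distinct from "CSV available but empty", so the HTML fallback only runs when the download is actually absent.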

TODO:

  1. Modularize the code
  2. Handle responses for different HTTP status codes
  3. Add tests
  4. SQL persistence
