ahlfors/AdsCrawler

 
 


Requirements

  1. tld

AdsCrawler

A crawler that crawls web pages and analyzes the ads in them. Once ads are found, Adblock Plus (ABP) filters can be generated automatically.

Common Ads model

Many ads follow the model below. ' ' The refer URL is the URL of the page that hosts the ad.
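
As an illustration of how the three URLs in this model could be collected from a page, here is a minimal sketch. It assumes the ad model is an image wrapped in a link (an `<img>` inside an `<a href>`), which is an assumption and not something the text above confirms. The names `LinkedImageParser` and `extract_linked_images` are hypothetical and not taken from the repository.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkedImageParser(HTMLParser):
    """Collect one record per <img> found inside an <a href=...> (assumed ad model)."""

    def __init__(self, refer_url):
        super().__init__()
        self.refer_url = refer_url   # URL of the page being parsed
        self.current_href = None     # href of the enclosing <a>, if any
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.current_href = attrs.get("href")
        elif tag == "img" and self.current_href and attrs.get("src"):
            # Resolve relative URLs against the hosting page.
            self.records.append({
                "ads_url": urljoin(self.refer_url, attrs["src"]),
                "ads_target_url": urljoin(self.refer_url, self.current_href),
                "refer_url": self.refer_url,
            })

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None


def extract_linked_images(html, refer_url):
    """Return a list of dicts with ads_url, ads_target_url and refer_url."""
    parser = LinkedImageParser(refer_url)
    parser.feed(html)
    parser.close()
    return parser.records
```

Images that are not wrapped in a link are skipped, since they have no target URL to record.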

How to find image ads on the internet

  1. Crawl the web and store a profile of each image object in a database.
  2. Analyze the profiles to find the images that are probably ads.
  3. Generate ABP filters from the ad image profiles.
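
Steps 2 and 3 could be sketched as follows. The heuristic is an assumption chosen for illustration, not the project's actual analysis: an image is treated as a probable ad when it is served from a different host than the page, links to a different host than the page, and its host shows up on at least `min_pages` distinct sites. The sketch compares full hostnames using the standard library rather than registered domains, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Lower-cased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def find_probable_ad_hosts(profiles, min_pages=2):
    """Return sorted image hosts that look like ad servers under the assumed heuristic."""
    sites_by_ad_host = defaultdict(set)
    for p in profiles:
        ad_host = host_of(p["ads_url"])
        target_host = host_of(p["ads_target_url"])
        refer_host = host_of(p["refer_url"])
        if not (ad_host and target_host and refer_host):
            continue
        # Off-site image that also links off-site: candidate ad.
        if ad_host != refer_host and target_host != refer_host:
            sites_by_ad_host[ad_host].add(refer_host)
    return sorted(h for h, sites in sites_by_ad_host.items() if len(sites) >= min_pages)


def to_abp_filters(ad_hosts):
    """Build one ABP blocking filter per host, limited to third-party images."""
    return ["||%s^$image,third-party" % host for host in ad_hosts]
```

The `||host^` pattern anchors the filter to the domain, and the `$image,third-party` options restrict it to image requests made from other sites, which keeps the generated rules narrow.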

Ads Profile Table

|----------------------------------------------------------------|
| record id (primary key) | ads URL | ads Target URL | refer URL |
|----------------------------------------------------------------|
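
One way to realize this table is with SQLite, shown here as a minimal sketch. The storage engine, the table name `ads_profile`, the column names, and the helper functions are all assumptions made for illustration. Only the four columns themselves come from the table above.

```python
import sqlite3

# Column names mirror the four fields of the Ads Profile Table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ads_profile (
    record_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    ads_url        TEXT NOT NULL,
    ads_target_url TEXT NOT NULL,
    refer_url      TEXT NOT NULL
)
"""


def open_profile_db(path=":memory:"):
    """Open (or create) the profile database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_profile(conn, ads_url, ads_target_url, refer_url):
    """Insert one image profile and return its record id."""
    cur = conn.execute(
        "INSERT INTO ads_profile (ads_url, ads_target_url, refer_url) VALUES (?, ?, ?)",
        (ads_url, ads_target_url, refer_url),
    )
    conn.commit()
    return cur.lastrowid
```

Passing a file path instead of the default `":memory:"` keeps the profiles between crawler runs.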


Languages

  • Python 100.0%