danryu/bbripper

bbripper

Project to scrape/rip certain content from the web

Requirements:

  • Jython
  • Java 7 JRE
  • Mozilla Firefox (current version)
  • Scrapy 0.16 or later
  • Sikuli 1.0.0 or later
  • ImageMagick + textcleaner
  • tesseract-ocr
  • pdfocr (modified) + option-modifier script
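
A quick way to see which of the tools above are installed is to look them up on `PATH`. The sketch below is only an illustration and is not part of bbripper: the executable names in `REQUIRED_TOOLS` are assumptions (for example, ImageMagick is assumed to provide `convert`), and the script is meant for a regular Python 3 interpreter rather than Jython. The `textcleaner` and `pdfocr` scripts are left out because their install locations are not stated here.

```python
import shutil

# Assumed executable names for the listed requirements; adjust them to
# match the actual installation (these are not confirmed by the project).
REQUIRED_TOOLS = ["jython", "java", "firefox", "scrapy", "convert", "tesseract"]


def missing_tools(names, which=shutil.which):
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; normal use relies on `shutil.which`.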

Hardware/OS requirements:

  • Linux is the initially supported platform (Ubuntu with Unity); it should also work on Windows and Mac
  • Approximately 200 GB of disk space (possibly more)
  • possible integration with VPS/cloud servers

Objective:

Running:

  • from ./sikuli_api/, run: ./sikuli-script -r ../workspace/bbripper/sikuli.sikuli
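
The same invocation could be wrapped in a small launcher so that the working directory is always set correctly. This is a minimal sketch under stated assumptions, not something shipped with the project: the helper names `sikuli_command` and `run_sikuli` are hypothetical, and the paths are simply the ones given in the command above.

```python
import subprocess


def sikuli_command(script="../workspace/bbripper/sikuli.sikuli"):
    """Build the argument list for the Sikuli runner (hypothetical helper)."""
    return ["./sikuli-script", "-r", script]


def run_sikuli(api_dir="./sikuli_api", script="../workspace/bbripper/sikuli.sikuli"):
    """Run the Sikuli script from inside api_dir and return its exit code."""
    # cwd makes the relative paths resolve as if launched from ./sikuli_api/
    completed = subprocess.run(sikuli_command(script), cwd=api_dir)
    return completed.returncode
```

Calling `run_sikuli()` from the directory that contains `sikuli_api` would then be roughly equivalent to the manual command.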
