Rippy

Rip-it with Rippy

Introduction

Rippy is a downloader that scrapes websites using a real web browser to find content such as videos or downloadable files. It targets websites that try to be scrape-resistant, where other downloaders have had to give up.

The magic is that Rippy controls a real browser, so many of the usual anti-bot measures, e.g. scrambled JavaScript, are ineffective. To block Rippy you would have to block browsers. I also enjoy a blocking arms race; it keeps my day bright and fulfilled.

Installation

Currently the only officially provided distribution method is docker-compose, but all Rippy really requires is Chrome and Python.

wget https://github.com/JohnDoee/rippy-docker/raw/master/docker-compose.yml

You should edit docker-compose.yml. The following values should be changed:

/tmp/media should be changed to where you want Rippy to download data. It appears in the file twice.
BASIC_AUTH_PASSWORD should be changed to a unique password.
SECRET_KEY should be changed to something unique.
Optional: change RIPPY_CONCURRENCY to the number of scrape and download threads you want.
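If you want random values for the two secrets rather than inventing them by hand, the following is a minimal Python sketch. The helper names generate_settings and set_media_dir are made up for this example and are not part of Rippy, and the exact layout of docker-compose.yml is not assumed: the script only prints values to paste in, plus a plain text replacement for the /tmp/media path.

```python
import secrets


def generate_settings() -> dict:
    """Return fresh random values for the two secrets named in docker-compose.yml."""
    return {
        # Hypothetical lengths; any sufficiently long random string should do.
        "BASIC_AUTH_PASSWORD": secrets.token_urlsafe(24),
        "SECRET_KEY": secrets.token_urlsafe(48),
    }


def set_media_dir(compose_text: str, media_dir: str) -> str:
    """Replace every occurrence of the default /tmp/media path in the file text."""
    return compose_text.replace("/tmp/media", media_dir)


if __name__ == "__main__":
    # Print the generated values so they can be pasted into docker-compose.yml.
    for name, value in generate_settings().items():
        print(f"{name}: {value}")
```

set_media_dir works on the file contents as a string, so it replaces both occurrences of /tmp/media in one call. Review the result before writing it back to disk.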

docker-compose up -d

Usage

Head over to http://ip:51359, where ip is the address of the machine running Rippy, and add a job. It should start downloading or prompt you to do something manually.
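If the page does not load, it can help to check that something is actually listening on port 51359. The following is a small sketch using only the Python standard library. The function name is_port_open is made up for this example, and the check only confirms that a TCP connection can be opened, not that Rippy itself is healthy.

```python
import socket


def is_port_open(host: str, port: int = 51359, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1 is an assumption; use the address of the machine running Rippy.
    print(is_port_open("127.0.0.1"))
```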

If the status text says “Waiting”, you need to open the browser and fill in a captcha or something similar. If you are using the docker-compose setup there should be a button in the upper-right corner of the website to open the browser. It opens a new window with a VNC session to the hosted Chromium browser.

New scrapers

Feel free to request a new scraper, but there are a few requirements if you want me to implement it:

The site is scrape-resistant, as in, nobody else is able to download from it. Check out tools like youtube-dl and JDownloader first.
The site does not use encryption and is not behind a paywall, i.e. I can’t do stuff like Netflix (something like that is not the target anyway).

Currently a generic video-site scraper is on the slab, as this project is a merge between a Reddit post and a generic video-site scraper.

Accompanying repositories

Docker-compose file and docker chromium repository

Rippy web interface

FAQ

Q: My tab crashed or elements on the website crashed, what should I do?
A: Close the tab. Rippy should notice shortly and try again.

TODO

[ ] Add (semi-)generic view player extractor
[ ] Return (potentially proxied) URL to video instead of downloading

Supported sites

Avgle

Docker images

Main backend component (this repository)

Webapp and reverse proxy

Chrome accessible via VNC

Logo / icon

frog by habione 404 from the Noun Project

License

MIT

