This repository has been archived by the owner on Oct 26, 2022. It is now read-only.

mixmoe/DanbooruSpider


DanbooruSpider

An efficient crawler built on Python's asynchronous features that works with multiple image websites using Danbooru as their backend

This document is also available in Chinese


Advantages

Universal

  • Currently supports the following multiple sites
  • The program was designed from the start to accommodate other download interfaces, so support for a new site can be added with only a small amount of code
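As an illustration of how a design like this can keep new sites cheap to add, here is a minimal sketch. It is not taken from the project's code: the names `BaseSite`, `ExampleBooru`, `list_url` and `extract_image_urls`, the URL, and the `file_url` field are all hypothetical and chosen only for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class BaseSite(ABC):
    """Hypothetical interface that each supported site would implement."""

    base_url: str

    @abstractmethod
    def list_url(self, page: int) -> str:
        """Return the URL of one page of the site's post listing."""

    @abstractmethod
    def extract_image_urls(self, posts: List[dict]) -> List[str]:
        """Pull the downloadable image URLs out of decoded post records."""


class ExampleBooru(BaseSite):
    """A made-up site; only the site-specific details live here."""

    base_url = "https://booru.example.invalid"

    def list_url(self, page: int) -> str:
        # Assumed listing endpoint, for illustration only.
        return f"{self.base_url}/posts.json?page={page}"

    def extract_image_urls(self, posts: List[dict]) -> List[str]:
        # Skip records that carry no image URL.
        return [post["file_url"] for post in posts if "file_url" in post]


# Adding a site then amounts to writing one small class and registering it.
SITES: Dict[str, Type[BaseSite]] = {"example": ExampleBooru}
```

With this shape, the shared crawling and downloading logic only talks to `BaseSite`, so it does not change when a site is added.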

Efficient

This program uses Python's asynchronous programming features to maximize resource utilization

  • All HTTP requests are made asynchronously through httpx, which keeps the program running efficiently
  • On the author's own Visual Studio Codespace:
    • Average download speed of up to 20 MiB/s in the default configuration
    • Memory footprint of at most 200 MiB
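The general pattern behind asynchronous downloading can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: `download_all`, `fake_fetch` and `max_concurrency` are hypothetical names, and `fake_fetch` stands in for a real network call so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[bytes]],
    max_concurrency: int = 8,
) -> List[bytes]:
    """Fetch every URL concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url: str) -> bytes:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded_fetch(url) for url in urls))


async def fake_fetch(url: str) -> bytes:
    """Stand-in for a real HTTP request (hypothetical, no network access)."""
    await asyncio.sleep(0.01)
    return url.encode()


if __name__ == "__main__":
    data = asyncio.run(download_all(["a", "b", "c"], fake_fetch, max_concurrency=2))
    print(len(data))
```

In a real crawler, `fetch` could wrap a shared `httpx.AsyncClient`, for example by awaiting `client.get(url)` and returning the response's `content`. The concurrency cap is a tuning knob that trades throughput against memory use and load on the remote site.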

Reliable

Since the project began, pylance and mypy have been used for type checking and code format checking, and pydantic models have been used for runtime type validation.
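To show what runtime validation with pydantic looks like in practice, here is a small sketch. The `Post` model and its fields are hypothetical and are not taken from the project's code; only `BaseModel` and `ValidationError` are real pydantic names.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Post(BaseModel):
    """Hypothetical shape of one post record returned by a site's API."""

    id: int
    file_url: str
    md5: str


def parse_post(raw: dict) -> Optional[Post]:
    """Return a validated Post, or None if the record is malformed."""
    try:
        return Post(**raw)
    except ValidationError:
        # Missing fields or values of the wrong type end up here.
        return None
```

Static checkers such as mypy see `Post.id` as an `int`, while pydantic enforces the same annotation on data that only arrives at run time, so malformed API responses are rejected at the boundary instead of failing later in the download code.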

Deployment

Deploying and using this project is simple:

Prerequisites

  • Python 3.8 or higher

  • A complete Python standard library

  • A local copy of the project code

Install dependencies

Open the project folder and run the following command:

pip install -r requirements.txt

Run

python3 main.py

Configuration

For details, see the comments in the configuration file