fp-server

Free proxy server based on Tornado and Scrapy.

Build your own proxy pool!

Features:

  • continuously crawls and provides free proxies
  • friendly and easy-to-use HTTP API
  • asynchronous and high-performance
  • supports high concurrency
  • periodically re-checks every proxy and ditches unavailable ones

Chinese documentation is still being written… _(:ι」∠)_

This project has been tested on:

  • Archlinux; Python 3.6.5
  • Debian (WSL); Python 3.5.3

Windows is not supported for now.


Get started

  1. Install the base requirements: Python >= 3.5 (I use Python 3.6.5) and Redis.
  2. Clone this repo.
  3. Install the Python packages:

     pip install -r requirements.txt

  4. Read the config and modify it according to your needs.
  5. Start the server:

     python ./src/main.py

  6. Use the APIs to get proxies, as in the sketch below.
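A minimal sketch of step 6, assuming the server is running locally on the default HTTP_PORT (12345); only the standard library is used:

import json
from urllib.request import urlopen

# Ask a local fp-server instance for one HTTP proxy.
with urlopen("http://127.0.0.1:12345/api/proxy/?count=1&scheme=HTTP") as resp:
    payload = json.loads(resp.read().decode("utf-8"))

# "code" is the event result (not the HTTP status); 0 means success.
if payload["code"] == 0:
    for item in payload["data"]["items"]:
        print(item["url"])
else:
    print("request failed:", payload["msg"])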

web APIs

typical response:

{
    "code": 0,
    "msg": "ok",
    "data": {
        ...
    }
}
  • code: result of the event (not the HTTP status code), 0 for success
  • msg: message for a failed event
  • data: details for a successful event

get proxies

GET /api/proxy/

params                 Must/Optional   detail                                              default
count                  O               the number of proxies you need                      1
scheme                 O               choices: HTTP, HTTPS                                both*
anonymity              O               choices: transparent, anonymous                     both
sort_by_speed (TODO)   O               1: descending order, 0: no order, -1: ascending     0

  • both: include all types, not grouped

example

  • To acquire 10 anonymous proxies for the HTTP scheme:
    GET /api/proxy/?count=10&scheme=HTTP&anonymity=anonymous
    
    The response:
    {
        "code": 0,
        "msg": "ok",
        "data": {
            "count": 9,
            "items": [
            {
                "port": 2000,
                "ip": "xxx.xxx.xx.xxx",
                "scheme": "HTTP",
                "url": "http://xxx.xxx.xxx.xx:xxxx",
                "anonymity": "transparent"
            },
            ...
            ]
        }
    }
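A returned item can be plugged straight into an HTTP client. A short sketch using the third-party requests library (the target URL is just a demo endpoint, and the server address assumes the default port):

import requests  # third-party: pip install requests

# Fetch one HTTP proxy from a local fp-server instance.
api = "http://127.0.0.1:12345/api/proxy/"
payload = requests.get(api, params={"count": 1, "scheme": "HTTP"}).json()

if payload["code"] == 0 and payload["data"]["count"] > 0:
    proxy_url = payload["data"]["items"][0]["url"]
    # Route a request through the proxy to confirm it works.
    r = requests.get("http://httpbin.org/ip",
                     proxies={"http": proxy_url}, timeout=10)
    print(r.text)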

screenshot

check status

Check server status, including:

  • Running spiders
  • Stored proxies
GET /api/status/

No params.

screenshot
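The status endpoint can be queried the same way; a stdlib sketch assuming the same local server on the default port:

import json
from urllib.request import urlopen

# Query the status endpoint of a local fp-server instance.
with urlopen("http://127.0.0.1:12345/api/status/") as resp:
    status = json.loads(resp.read().decode("utf-8"))

print(json.dumps(status["data"], indent=4))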

Config

Path: {repo}/src/config/common.py (a sketch of these settings follows the list)

  • HTTP_PORT: the HTTP port to run on (default: 12345)
  • CONSOLE_OUTPUT: if set to 1, the server prints logs to the console instead of to a file (default: 1)
  • LOG: logging config, including:
    • level, dir and filename; logging to a file requires CONSOLE_OUTPUT = 0
  • REDIS: Redis database config, including:
    • host, port, db
  • PROXY_STORE_NUM: the number of proxies you want to keep (default: 500)
    • Once this number is reached, the crawlers stop fetching new proxies.
    • Set it according to your needs.
  • PROXY_STORE_CHECK_SEC: how often each stored proxy is re-checked
    • The period applies to each individual proxy, not to the checker spider.
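A hedged sketch of what these settings might look like in common.py; the names and defaults follow the list above, while the LOG/REDIS field layouts and the PROXY_STORE_CHECK_SEC value are illustrative assumptions, not the project's exact code:

# src/config/common.py (sketch; exact layout may differ from the repo)

HTTP_PORT = 12345    # HTTP port the API server listens on
CONSOLE_OUTPUT = 1   # 1: log to console; 0: log to the file configured below

LOG = {
    "level": "DEBUG",             # assumption: a typical logging level
    "dir": "logs",                # assumption: log directory
    "filename": "fp-server.log",  # used only when CONSOLE_OUTPUT = 0
}

REDIS = {
    "host": "127.0.0.1",
    "port": 6379,  # default Redis port
    "db": 0,
}

PROXY_STORE_NUM = 500         # stop crawling once this many proxies are stored
PROXY_STORE_CHECK_SEC = 3600  # assumption: re-check each stored proxy hourly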

Source websites

The list of supported source websites is still growing.

Bugs and feature requests

I need your feedback to make this better.
Please open an issue for any problem or suggestion.

Known bugs:

  • Many weird None values… probably related to thread-safety issues
  • Blocking when using Tornado 4.5.3

TODOs

  • Split the log module out
  • More detailed API
  • Docker support
  • Web frontend via Bootstrap
  • Add a user-agent pool
