ipdb_creator

A tool that builds an IP database by crawling data from the web.

#File descriptions

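query.py: The most basic component. It defines the query_ip function, which takes an IP address and looks up its data through the Taobao API.
build_rtree.py: Defines the ipRadixDB class, which wraps some of the radix functionality and the exchange of data with external sources.
fully_update_cn.py: Builds the domestic (China) IP database. Running fully_update_cn.py produces the file ip_data_cn_merged, which holds the domestic IP data from Taobao. For domestic IPs, every /24 subnet is scanned.
fully_update_fn.py: Builds the foreign IP database. Running fully_update_fn.py produces the file ip_data_fn_merged, which holds the foreign IP data. For foreign IPs, the country is determined only from the country code assigned in the delegated files; if a record has no country code, it is looked up through Taobao.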
starter.py: The launcher. It calls fully_update_fn.py and fully_update_cn.py, then merges ip_data_cn_merged and ip_data_fn_merged into ipdb.dat, which is the complete IP database.
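delegated-*-latest: These 5 files are the IP datasets published by the IP allocation organizations. Update them before running; see the appendix for where to get the latest versions.
country_code: Table of country codes.
log.py: Log output.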
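
The /24 scan and the radix lookup described above can be illustrated with a small sketch. It is not the project's code: it uses only the standard library ipaddress module instead of py-radix or netaddr, iter_slash24 and PrefixTable are hypothetical names, and the dictionary-based longest-prefix match only stands in for what a radix tree provides.

```python
import ipaddress


def iter_slash24(cidr):
    """Yield every /24 subnet contained in the given IPv4 block."""
    network = ipaddress.ip_network(cidr, strict=True)
    if network.prefixlen >= 24:
        # A block that is already /24 or smaller is yielded unchanged.
        yield network
        return
    yield from network.subnets(new_prefix=24)


class PrefixTable:
    """Hypothetical stand-in for a radix tree: longest-prefix match on IPv4."""

    def __init__(self):
        # Maps prefix length -> {network address as int: attached data}.
        self._by_length = {}

    def add(self, cidr, data):
        """Attach data to a prefix such as '10.0.0.0/8'."""
        network = ipaddress.ip_network(cidr)
        bucket = self._by_length.setdefault(network.prefixlen, {})
        bucket[int(network.network_address)] = data

    def search_best(self, ip):
        """Return the data of the most specific prefix covering ip, or None."""
        address = int(ipaddress.IPv4Address(ip))
        # Try the longest prefixes first so the most specific match wins.
        for length in sorted(self._by_length, reverse=True):
            mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
            bucket = self._by_length[length]
            key = address & mask
            if key in bucket:
                return bucket[key]
        return None
```

In this sketch, each /24 produced by iter_slash24 would be queried once and its result stored with add; search_best then answers lookups for any single address. The labels stored as data are whatever the caller chooses.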

#How to run

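Make sure the delegated-*-latest files have been updated to the latest versions.
Make sure the output folder is empty.
Make sure the py-radix, netaddr, and requests modules are installed for Python; all three can be installed with pip.
Start the run with python starter.py. It takes a long time, so add nohup yourself if you want it to run in the background.
Building the complete database takes roughly 10 days.
TODO: If multiple egress IPs are available, look for a way to speed up the queries.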
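
The checks listed above could be automated before a long run. The following is only a sketch under assumptions: preflight is a hypothetical helper that is not part of the repository, it assumes the delegated files and the output folder sit under one base directory, and it treats a missing output folder as empty. The module name radix is the import name of py-radix.

```python
import glob
import importlib.util
import os


def preflight(base_dir=".", modules=("radix", "netaddr", "requests")):
    """Return a list of problems found; an empty list means ready to run."""
    problems = []

    # The delegated-*-latest files must be present.
    # Whether they are actually up to date is not checked here.
    if not glob.glob(os.path.join(base_dir, "delegated-*-latest")):
        problems.append("no delegated-*-latest files found")

    # The output folder must be empty (assumption: a missing folder is fine).
    output_dir = os.path.join(base_dir, "output")
    if os.path.isdir(output_dir) and os.listdir(output_dir):
        problems.append("output folder is not empty")

    # Each required module must be importable.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append("module not installed: " + name)

    return problems
```

Calling preflight() from the project directory and printing the returned list gives a quick summary of anything that still needs fixing before starting starter.py.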

Appendix

Latest delegated files: http://ftp.apnic.net/stats/
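
For orientation, the delegated files are pipe-separated text in which an IPv4 record carries the registry, country code, type, start address, address count, date, and status. The sketch below shows one way such records might be turned into CIDR blocks; parse_ipv4_records is a hypothetical name, it uses the standard library ipaddress module rather than netaddr, and it is not necessarily how fully_update_fn.py reads the files.

```python
import ipaddress


def parse_ipv4_records(lines):
    """Yield (country_code, cidr) pairs from delegated-file lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        # Skip the version header and summary lines; keep only ipv4 records.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[3] == "*":
            continue
        country, start, count = fields[1], fields[3], int(fields[4])
        first = ipaddress.IPv4Address(start)
        last = first + (count - 1)
        # The address count need not be a power of two, so a single
        # record may expand into several CIDR blocks.
        for network in ipaddress.summarize_address_range(first, last):
            yield country, str(network)
```

A record whose country field is empty is yielded with an empty string, which is the case the file descriptions say is resolved through Taobao instead.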

Country codes: http://zh.wikipedia.org/wiki/%E5%9C%8B%E5%AE%B6%E4%BB%A3%E7%A2%BC
