Python Spider.comic_search Examples

Programming Language: Python

Namespace/Package Name: spider

Class/Type: Spider

Method/Function: comic_search

Examples at hotexamples.com: 1

Python Spider.comic_search - 1 example found. This is the top-rated real-world Python example of spider.Spider.comic_search extracted from open source projects.
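Example #1 on this page calls `comic_search` with an empty keyword and the page string `'1'`, reads `['_meta']['pageCount']` from the result, and fills a queue with page numbers. The call pattern might be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `StubSpider` and `build_page_queue` are hypothetical names, and the response contains only the two keys that Example #1 reads.

```python
import queue


class StubSpider:
    """Hypothetical stand-in for spider.Spider; not the real class."""

    def __init__(self, page_count):
        self.page_count = page_count

    def comic_search(self, keyword, page):
        # Assumed response shape: only the '_meta' and 'pageCount' keys
        # come from Example #1; the real method may return much more.
        return {"_meta": {"pageCount": self.page_count}}


def build_page_queue(spider, default_total=258):
    """Return a queue holding the page numbers 1 .. total - 1."""
    total = default_total
    try:
        # Same call shape as Example #1: empty keyword, page '1'.
        total = spider.comic_search("", "1")["_meta"]["pageCount"] + 1
    except (KeyError, TypeError) as exc:
        # Keep the preset value if the response lacks a usable page count.
        print(f"Could not read the page count, using the default: {exc}")
    pages = queue.Queue()
    for page in range(1, total):
        pages.put(page)
    return pages
```

With a stub that reports three pages, `build_page_queue(StubSpider(3))` yields a queue containing 1, 2 and 3. Catching specific exceptions rather than using a bare `except` keeps unrelated errors visible.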

Example #1

File: spiderInDB.py Project: coldrainf/ACSpider

import queue
import threading

import pymysql
from spider import Spider

# Preset total number of comics to crawl: 258*20
total = 258
sql_insert = """insert into comic_info values ('%d',"%s","%s","%s","%s","%s","%s", NULL, NULL, NULL, NULL, NULL, NULL)"""
sql_select = """select cslug from comic_info where cid = '%d' and clastname != "%s" """
sql_update = """update comic_info set clastname = "%s", cserialise = '%d' where cid = '%d'"""
sql_update2 = """update comic_info set ctype = "%s", ccategory = "%s", carea = "%s", cupdate = "%s", cchapters = "%s", cchapterurl = "%s" where cslug = '%s'"""
threadList = ["Thread-1", "Thread-2", "Thread-3", "Thread-4"]

sp = Spider()
# Try to fetch the total page count
try:
    total = sp.comic_search('', '1')['_meta']['pageCount'] + 1
except Exception:
    print('Failed to look up the total page count')

workQueue = queue.Queue(total * 21)
# Fill the queue with page numbers
for page in range(1, total):
    workQueue.put(page)
spiderUrls = []
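# Note: threading.TIMEOUT_MAX is documented as a constant (the largest
# timeout value that blocking calls accept); reassigning it is not a
# documented way to set a timeout.
threading.TIMEOUT_MAX = 10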


# Set up the worker thread class
class myThread(threading.Thread):
    def __init__(self, name, q, flag):
        threading.Thread.__init__(self)