Python Spider.setworkdir Examples

Programming Language: Python

Namespace/Package Name: spider

Class/Type: Spider

Method/Function: setworkdir

Examples at hotexamples.com: 1

Python Spider.setworkdir - 1 example found. This is the top-rated real-world Python example of spider.Spider.setworkdir, extracted from open source projects.
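Example #1 on this page calls setworkdir, setfilename and getfilename, but the Spider class itself is not shown. The following is a minimal sketch of how such path helpers might behave. It assumes getfilename simply joins the working directory and the file name. The class name WorkdirSketch and its attributes are hypothetical and are not taken from the gccccc/spider project.

```python
import os


class WorkdirSketch(object):
    """Hypothetical stand-in for the path-handling part of Spider."""

    def __init__(self):
        self.workdir = ''
        self.filename = ''

    def setworkdir(self, path):
        # Assumption: the method only records the directory.
        self.workdir = path

    def setfilename(self, name):
        # Assumption: the name is kept relative to the working directory.
        self.filename = name

    def getfilename(self):
        # Assumption: returns the working directory joined with the file name.
        return os.path.join(self.workdir, self.filename)


sketch = WorkdirSketch()
sketch.setworkdir('/data/work/ys/oriinfo/ownerinfo/')
sketch.setfilename('owneridlist.txt')
full_path = sketch.getfilename()
print(full_path)
```

If the real class works along these lines, the open(spider.getfilename(), 'r+') call in Example #1 opens the file inside the configured working directory. The actual implementation may differ, for example by changing the process's current directory instead.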

Frequently Used Methods

Spider(30)

crawl_page(30)

crawl(14)

__init__(8)

craw(4)

Search(4)

crawl_genre(3)

build_node(3)

analyse(3)

process_page(2)

court(2)

add_url(2)

content_list(2)

GetInfo(2)

crowl(1)

crowl_page(1)

GET(1)

crawled_page(1)

createResultExcel(1)

get2l_url(1)

crawledPage(1)

crawle_page_in_queue(1)

crawl_weather(1)

crawl_video_urls(1)

crawl_robots(1)

data(1)

getfilename(1)

get3l_url(1)

post(1)

update(1)

startCrawl(1)

setworkdir(1)

setfilename(1)

setDaemon(1)

responseCallback(1)

parse_blog(1)

getSoup(1)

linkCallback(1)

levelCallback(1)

is_valid(1)

is_outgoing(1)

htmlCallback(1)

get_pdfs(1)

crawl_page_graph(1)

crawl_async_slots(1)

crawl_next_page_from_queue(1)

authorized(1)

Process(1)

ReturnValues(1)

Text(1)

Example #1

File: xzowner.py Project: gccccc/spider

#!/usr/bin/env python
# coding=utf-8
from spider import Spider

# Point the spider at the directory and file that hold the owner IDs.
spider = Spider()
spider.setworkdir('/data/work/ys/oriinfo/ownerinfo/')
spider.setfilename('owneridlist.txt')

# Each line of the input file is one owner ID.
f = open(spider.getfilename(), 'r+')
while True:
    # Give each section its own dict; a chained assignment would make
    # all four keys share a single dict object.
    dic = {'diary': {}, 'information': {}, 'allComments': {}, 'order': {}}
    line = f.readline()
    if not line:
        break
    # Strip only the trailing newline so a final line without one stays intact.
    line = line.rstrip('\n')
    print(line)
    soup = spider.getSoup('http://www.xiaozhu.com/fangdong/' + line + '/pinglun.html')
    ul = soup.find('ul', {'class': 'comment_right'})
    dic['allComments']['rate'] = {}
    item = ['sanitationRate', 'descriptionRate', 'performanceRate', 'securityRate', 'locationRate']
    if ul is None:
        # No rating block on the page: record placeholder values.
        dic['nohtml'] = True
        for i in item:
            dic['allComments']['rate'][i] = 'NULL'
        dic['allComments']['rate']['allcommentRate'] = 'NULL'
    else:
        dic['nohtml'] = False
        liAll = ul.findAll('li')
        cot = 0
        for li in liAll:
            print(li)
            grade = li.find('span').find('em').get('value')