Python Status Examples

Programming language: Python

Namespace/package name: sci_common

Class/type: Status

Examples on hotexamples.com: 2

Python Status - 2 examples found. These are the top-rated real-world Python examples of sci_common.Status, extracted from open source projects.

Frequently used methods

query_index(2)

__str__(1)

index(1)

query(1)

reset(1)

resultCount(1)
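
The page lists the members of Status that the examples touch, but not the class itself. A minimal sketch of what such a class might look like is given here, purely as an illustration: the real sci_common.Status is not shown in the source, so the default values, the meaning of `index` and `resultCount`, and the choice of which fields `reset()` clears are all assumptions.

```python
class Status:
    """Hypothetical stand-in for sci_common.Status (the real class is not shown)."""

    def __init__(self):
        self.query_index = 0   # position in the list of query words
        self.query = ''        # the query word currently being crawled
        self.index = 0         # assumed: position within the current result list
        self.resultCount = 0   # assumed: number of results for the current query

    def reset(self):
        """Clear the per-query fields (assumption: query_index is kept)."""
        self.query = ''
        self.index = 0
        self.resultCount = 0

    def __str__(self):
        return 'query_index=%d, query=%r, index=%d, resultCount=%d' % (
            self.query_index, self.query, self.index, self.resultCount)
```

With a `__str__` like this, `str(status)` gives a one-line summary that is convenient for the log messages used in the examples.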

Example #1

File: sci_clawer.py Project: irgb/SCICrawler

console = logging.StreamHandler()
console.setLevel(logging.WARNING)
console.setFormatter(formatter)
logging.getLogger('').addHandler(console)
###--------------main-----------------###

driver = webdriver.Chrome()
# driver = webdriver.Remote("http://localhost:4444/wd/hub", webdriver.DesiredCapabilities.CHROME.copy())
filterPath = 'sci.bloom_filter'
bf = BloomFilter.open(filterPath) if isfile(filterPath) else BloomFilter(1000000, 0.001, filterPath)
logging.info('bloom filter loaded')
# Store the paper information in the paperInfo object
paperInfo = PaperInfo()

# status records the current crawl state
status = Status()
statusPath = 'sci.status'
if isfile(statusPath):
    # Pickle data is binary, so open the file in 'rb' mode
    with open(statusPath, 'rb') as f:
        status = pkl.load(f)
    logging.info('status loaded')
    logging.warning('current status: ' + str(status))

# Whether to crawl the query words in reverse order
reverse = True
index_range = []
if not reverse:
    index_range = range(status.query_index, len(querywords))
else:
    if status.query_index == 0:
        status.query_index = len(querywords) - 1
    index_range = range(status.query_index, -1, -1)
# begin crawling
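
The index-range logic at the end of this excerpt can be isolated into a small function, which makes the forward and reverse cases easier to check. This is only a sketch: `resume_range` is a hypothetical name that does not appear in the project, and it copies the excerpt's behaviour of treating a saved index of 0 in reverse mode as "start from the last query word".

```python
def resume_range(start, total, reverse=False):
    """Return the query indices still to be crawled, beginning at `start`."""
    if not reverse:
        return range(start, total)
    # Mirrors the excerpt: in reverse mode a start of 0 means "begin at the end".
    if start == 0:
        start = total - 1
    return range(start, -1, -1)


# Usage with a hypothetical list of five query words:
forward = list(resume_range(2, 5))                  # indices 2, 3, 4
backward = list(resume_range(0, 5, reverse=True))   # indices 4, 3, 2, 1, 0
```

One consequence of copying that rule is that a reverse crawl which has genuinely reached index 0 cannot be told apart from a fresh start, so a real crawler may want a separate "finished" flag.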

Example #2

File: sci_clawer.py Project: waleking/SCICrawler

formatter = logging.Formatter('%(asctime)s, %(filename)s:%(lineno)d, %(levelname)s: %(message)s')
console = logging.StreamHandler()
console.setLevel(logging.WARNING)
console.setFormatter(formatter)
logging.getLogger('').addHandler(console)
###--------------main-----------------###

driver = webdriver.Chrome()

filterPath = 'sci.bloom_filter'
bf = BloomFilter.open(filterPath) if isfile(filterPath) else BloomFilter(1000000, 0.001, filterPath)
logging.info('bloom filter loaded')
# Store the paper information in the paperInfo object
paperInfo = PaperInfo()
# status records the current crawl state
status = Status()
statusPath = 'sci.status'
if isfile(statusPath):
    # Pickle data is binary, so open the file in 'rb' mode
    with open(statusPath, 'rb') as f:
        status = pkl.load(f)
    logging.info('status loaded')
try:
    # Start from the saved query_index
    for i in range(status.query_index, len(querywords)):
        # Reset the status for this query
        status.reset()
        query = querywords[i]
        status.query_index = i
        status.query = query
        # count tracks how many papers have been crawled for this query (at most 100 per query)
        count = 0
        logging.info('current query: index = ' + str(i) + ', keyword = ' + query)
        driver.get('http://apps.webofknowledge.com/')
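
The excerpt stops here, before the code that writes `sci.status` back to disk. Both examples restore the crawl state with pickle, so a matching save/load pair might look like the sketch below. The function names `save_checkpoint` and `load_checkpoint` are hypothetical, and writing through a temporary file is a design choice of this sketch, not something the project is known to do.

```python
import os
import pickle
import tempfile


def save_checkpoint(obj, path):
    """Pickle obj to path via a temporary file, so a crash cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as f:
            pickle.dump(obj, f)
        os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default=None):
    """Return the unpickled object stored at path, or default if the file is missing."""
    if not os.path.isfile(path):
        return default
    with open(path, 'rb') as f:
        return pickle.load(f)
```

In a crawler like the ones above, `save_checkpoint(status, statusPath)` could be called after each paper or each query, so that an interrupted run resumes close to where it stopped. Only load pickle files you created yourself, since unpickling untrusted data can execute arbitrary code.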