Example #1
def extract_multi(to_fetch, seen_urls):
    # Fetch each URL in to_fetch that hasn't been seen before, recording it
    # in seen_urls so repeated calls skip already-downloaded pages.
    results = []
    for url in to_fetch:
        if url in seen_urls: continue
        seen_urls.add(url)
        try:
            results.append(extract(url))
        except Exception:
            continue  # Skip URLs that fail to download or parse
    return results
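A minimal usage sketch for the helper above. The extract() stub below is a hypothetical stand-in so the example runs on its own; judging by how later examples unpack it, the real extract() returns a 3-tuple whose last two items are the page data and the discovered links.

def extract(url):
    # Hypothetical stub; the real extract() downloads and parses the page.
    return ('GET', '<html>...</html>', [])

seen_urls = set()
first = extract_multi(['http://example.com/a', 'http://example.com/b'], seen_urls)
# Passing the same set again skips URLs that were already fetched.
second = extract_multi(['http://example.com/a', 'http://example.com/c'], seen_urls)
assert len(second) == 1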
Example #2
def fetcher(fetch_queue, output_queue):
    # Worker thread body: pull (state, depth, url) tuples off fetch_queue,
    # download each page with extract(), and push a FetchResult onto output_queue.
    logging.info('Starting fetcher thread')
    while True:
        state, depth, url = fetch_queue.get()
        logging.info('%s: Fetching in thread: %s', id(state), url)
        try:
            try:
                _, data, found_urls = extract(url)
            except Exception:
                data, found_urls = None, []  # Treat failed downloads as empty pages

            output_queue.put(FetchResult(state, depth, url, data, found_urls))
        finally:
            fetch_queue.task_done()  # Always acknowledge the item so join() can return
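A hedged sketch of how this worker might be driven: the thread count, seed URL, and placeholder state object are assumptions, while FetchResult and extract() come from the surrounding crawler code.

from queue import Queue
from threading import Thread

fetch_queue = Queue()
output_queue = Queue()

# Daemon threads exit with the main thread, so no explicit shutdown is needed.
for _ in range(4):
    Thread(target=fetcher, args=(fetch_queue, output_queue), daemon=True).start()

state = object()  # Placeholder for whatever crawl-state object the real code passes around
fetch_queue.put((state, 0, 'http://example.com'))
fetch_queue.join()            # Returns once every queued item has been task_done()
result = output_queue.get()   # FetchResult for the seed URL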
Example #3
def fetcher(fetch_queue, max_depth, seen_urls, output_queue):
    # Worker thread body: crawl breadth-first, feeding newly found links
    # back into fetch_queue until max_depth is reached.
    while True:
        depth, url = fetch_queue.get()
        try:
            if depth > max_depth: continue  # Ignore URLs that are too deep
            if url in seen_urls: continue   # Prevent infinite loops

            seen_urls.add(url)              # Check-then-add is only safe thanks to the GIL :/
            try:
                _, data, found_urls = extract(url)
            except Exception:
                continue

            output_queue.put((url, data))
            for found in found_urls:
                fetch_queue.put((depth + 1, found))
        finally:
            fetch_queue.task_done()
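A sketch of wiring up this version under the same assumptions (the depth limit, thread count, and seed URL are illustrative only). Here fetch_queue.join() doubles as a completion signal, because each worker re-queues the links it found before calling task_done().

from queue import Queue
from threading import Thread

fetch_queue = Queue()
output_queue = Queue()
seen_urls = set()

for _ in range(8):
    Thread(target=fetcher, args=(fetch_queue, 3, seen_urls, output_queue),
           daemon=True).start()

fetch_queue.put((0, 'http://example.com'))  # Seed the crawl at depth 0
fetch_queue.join()                          # Blocks until the whole crawl finishes

pages = []
while not output_queue.empty():  # Safe here because all workers are idle after join()
    pages.append(output_queue.get())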
Example #4
def consumer(fetch_queue, max_depth, seen_urls, result):
    # Like fetcher() above, but appends results to a shared list instead of
    # writing them to an output queue.
    while True:
        depth, url = fetch_queue.get()
        try:
            if depth > max_depth: continue
            if url in seen_urls: continue      # Membership check relies on the GIL :|

            seen_urls.add(url)                 # So does this add :/
            try:
                _, data, found_urls = extract(url)
            except Exception:
                continue

            result.append((depth, url, data))  # And this shared-list append :(
            for found in found_urls:
                fetch_queue.put((depth + 1, found))
        finally:
            fetch_queue.task_done()
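Driving this variant, again sketched with assumed names and limits, is slightly simpler because results land directly in the shared list; reading it after join() should be safe here since the idle workers no longer append to it.

from queue import Queue
from threading import Thread

fetch_queue = Queue()
seen_urls = set()
result = []

for _ in range(4):
    Thread(target=consumer, args=(fetch_queue, 3, seen_urls, result),
           daemon=True).start()

fetch_queue.put((0, 'http://example.com'))
fetch_queue.join()
result.sort()  # Tuples start with depth, so sorting orders pages by crawl depth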