The name of this test means that the crawler will often jump to distant
locations, increasing its depth quickly.
"""

import sys
import time

sys.path.append("../web_crawler")
from web_crawler import WebCrawler

sys.path.append("..")
from privileges import construct_full_privilege, privileges_bigger_or_equal


master_crawler = WebCrawler.create_master(
    privileges=construct_full_privilege(),
    start_url="http://antyweb.pl/",
)


WebCrawler.create_worker(
    privileges=construct_full_privilege(),
    master=master_crawler,
    max_internal_expansion=5,   # per-page limit on same-site links (inferred from the name)
    max_external_expansion=3,   # per-page limit on links to other sites (inferred from the name)
    max_crawling_depth=100,     # generous depth limit, so the crawler can descend far
)

master_crawler.run()

time.sleep(60 * 60 * 24 * 3)  # let the crawl run for three days before shutting down
master_crawler.terminate()
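
# For unattended runs like the three-day crawl above, a small wrapper can
# guarantee shutdown even if the wait is interrupted. A sketch, assuming
# terminate() is safe to call at any point after run() (not confirmed by
# the library code shown here):
def run_for(crawler, seconds):
    crawler.run()
    try:
        time.sleep(seconds)   # wait while the workers crawl
    finally:
        crawler.terminate()   # always stop the master, even on interruption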
Example No. 2
import sys

sys.path.append("../web_crawler")
from web_crawler import WebCrawler

sys.path.append("..")
from privileges import construct_full_privilege

EXPORT_FILE = 'rss_feeds'

if len(sys.argv) == 1:
    print __doc__
    print 'Usage: python2 web_crawler_exporter.py [WEBSITE]'
    print 'where [WEBSITE] is a full url, for example: http://news.google.com'
    print 'See README.md for details.'
    sys.exit()

print 'Output will be APPENDED to file named ' + EXPORT_FILE + '\n'

master_crawler = WebCrawler.create_master(
    privileges=construct_full_privilege(),
    start_url=str(sys.argv[1]),
)

WebCrawler.create_worker(
    privileges=construct_full_privilege(),
    master=master_crawler,
    max_external_expansion=1000,  # follow many links to other sites
    max_internal_expansion=4,     # but only a few within each site
    max_crawling_depth=3,         # stay shallow: this is an exporter, not a deep crawl
    list_export=True,             # export results as a list (inferred from the name)
    export_dicts=True,            # export entries as dicts (inferred from the name)
    export_file=EXPORT_FILE,      # append output to the 'rss_feeds' file
)

master_crawler.run()
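
# The crawl above appends its results to EXPORT_FILE. A minimal sketch of
# reading them back, assuming export_dicts=True writes one Python dict
# literal per line (the library's actual on-disk format is not shown here)
# and that the crawl has finished:
import ast

def load_exported_feeds(path=EXPORT_FILE):
    # Parse the appended export file back into a list of dicts.
    entries = []
    with open(path) as export:
        for line in export:
            line = line.strip()
            if not line:
                continue
            try:
                entries.append(ast.literal_eval(line))  # one exported entry
            except (ValueError, SyntaxError):
                pass  # tolerate lines that are not dict literals
    return entries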