def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['komm']
    self.dateend = ('26.03.2017'
                    if self.storage['end_reached'][self.storage['politNum']]
                    else self.storage['dateEnd'])
    # Politician names as percent-encoded windows-1251 (cp1251) query terms.
    # Decoded: 'николя саркози', 'франсуа олланд', 'дмитрий медведев',
    # 'дэвид кэмерон', 'владимир путин', 'ангела меркель', 'тереза мэй'.
    self.politicians = ('%ED%E8%EA%EE%EB%FF+%F1%E0%F0%EA%EE%E7%E8',
                        '%F4%F0%E0%ED%F1%F3%E0+%EE%EB%EB%E0%ED%E4',
                        '%E4%EC%E8%F2%F0%E8%E9+%EC%E5%E4%E2%E5%E4%E5%E2',
                        '%E4%FD%E2%E8%E4+%EA%FD%EC%E5%F0%EE%ED',
                        '%E2%EB%E0%E4%E8%EC%E8%F0+%EF%F3%F2%E8%ED',
                        '%E0%ED%E3%E5%EB%E0+%EC%E5%F0%EA%E5%EB%FC',
                        '%F2%E5%F0%E5%E7%E0+%EC%FD%E9')
    self.data_format = '%Y-%m-%d'
    self.starting_page = 1
    self.payload = {}
    self.update_payload()
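# The percent-escaped tuples above (and in the 'vz' crawler below) are
# cp1251-encoded query terms. A minimal sketch of decoding them back to
# readable Cyrillic, assuming only the standard library:
from urllib.parse import unquote_plus

def decode_cp1251_term(term):
    # unquote_plus also translates '+' into a space.
    return unquote_plus(term, encoding='cp1251')

print(decode_cp1251_term('%ED%E8%EA%EE%EB%FF+%F1%E0%F0%EA%EE%E7%E8'))
# -> 'николя саркози'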
def __init__(self):
    Crawler.__init__(self)
    self.link_crawler = None
    self.url = 'https://www.instagram.com'
    # Create the data directory and files.
    self.create_data_storage()
    # Configure logging.
    Crawler.set_logs('Instagram_Crawler_log', './logging/logfile_instagram.log')
def __init__(self, auth=None, urls=None, force_sync=False, config=None, api_limit=0):
    # None defaults replace the original mutable {} defaults, which would be
    # shared across every instance of the class.
    auth = auth if auth is not None else {}
    urls = urls if urls is not None else {}
    config = config if config is not None else {}
    Crawler.__init__(self, auth, urls, force_sync, config, api_limit)
    self._type = config['fetch_by_type']
    self._filter = config['filter_key']
    self._count_cfg = Config(storage=self._config_strategy, type='counts')
    self._offset_cfg = Config(storage=self._config_strategy, type='offsets')
    self._MAX_RESULT_PER_TARGET = 0
    self._recipe_factory = RecipeFactory(connector=self._data_get_connector,
                                         storage=self._data_strategy)
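# The None defaults above avoid Python's shared-mutable-default pitfall,
# illustrated here with a hypothetical Collector class (not from the source):
class Collector:
    def __init__(self, items=[]):  # one list object reused by every call
        self.items = items

a, b = Collector(), Collector()
a.items.append('x')
print(b.items)  # ['x'] -- b observes a's mutation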
def __init__(self, proj_name): Crawler.__init__(self, proj_name) self.name = "雷锋网" self.root_url = "http://www.leiphone.com" self.headers = { 'Host': 'www.leiphone.com', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; rv:16.0) Gecko/20100101 Firefox/16.0', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'Connection': 'keep-alive' }
def __init__(self):
    Crawler.__init__(self)
    self.content_crawler = None
    # Sorted by product reviews:
    # self.url = "https://store.musinsa.com/app/product/search?search_type=1&pre_q=&d_cat_cd=&brand=&rate=&page_kind=search&list_kind=small&sort=emt_high&page=%s&display_cnt=120&sale_goods=&ex_soldout=&color=&popup=&q=%s&price1=&price2="
    # Sorted by lowest price:
    self.url = "https://store.musinsa.com/app/product/search?search_type=1&pre_q=&d_cat_cd=&brand=&rate=&page_kind=search&list_kind=small&sort=price_low&page=%s&display_cnt=120&sale_goods=&ex_soldout=&color=&popup=&chk_research=&q=%s&chk_brand=&price1=&price2=&chk_color=&chk_sale=&chk_soldout="
    self.content_url = "https://store.musinsa.com"
    # Create the data directory and files.
    self.create_data_storage()
    # Configure logging.
    Crawler.set_logs('Musinsa_Crawler_log', './logging/logfile_musinsa.log')
def __init__(self, student_id, password):
    """Constructor for getting student id and password."""
    # Initialize the base class Crawler.
    Crawler.__init__(self)
    self.student_id = student_id
    self.password = password
    # Structure the authentication data into a dict for posting to the server.
    self.auth_data = {'dfUsernameHidden': student_id,
                      'dfPasswordHidden': password}
    # Log in to the website so later requests can reuse this session;
    # store the login status.
    self.status = self.login()
def __init__(self, shelf, pswd=None):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['suedd']
    self.politicians = ('sarkozy', 'hollande', 'medwedew', 'cameron',
                        'putin', 'merkel', 'theresa+AND+may')
    self.starting_page = 1
    self.update_payload()
    self.data_format = '%d.%m.%Y'
def __init__(self, shelf):
    Crawler.__init__(self)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['ksta_de']
    self.politicians = ('sarkozy', 'hollande', 'dmitri|dmitrij+medwedew',
                        'david+cameron', 'putin', 'merkel', 'theresa+may')
    self.site = r'http://www.berliner-zeitung.de/action/berliner-zeitung/4484314/search?'
    self.starting_page = 0
    self.data_format = '%Y-%m-%d'
    self.update_payload()
def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['vz']
    # Same cp1251 percent-encoded politician names as the 'komm' crawler above.
    self.politicians = ('%ED%E8%EA%EE%EB%FF+%F1%E0%F0%EA%EE%E7%E8',
                        '%F4%F0%E0%ED%F1%F3%E0+%EE%EB%EB%E0%ED%E4',
                        '%E4%EC%E8%F2%F0%E8%E9+%EC%E5%E4%E2%E5%E4%E5%E2',
                        '%E4%FD%E2%E8%E4+%EA%FD%EC%E5%F0%EE%ED',
                        '%E2%EB%E0%E4%E8%EC%E8%F0+%EF%F3%F2%E8%ED',
                        '%E0%ED%E3%E5%EB%E0+%EC%E5%F0%EA%E5%EB%FC',
                        '%F2%E5%F0%E5%E7%E0+%EC%FD%E9')
    self.data_format = '%Y-%m-%d'
    self.starting_page = 1
    self.payload = {}
    self.update_payload()
def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['lemonde']
    self.politicians = ('nicolas sarkozy', 'francois hollande', 'dmitry medvedev',
                        'david cameron', 'vladimir putin', 'angela merkel',
                        'theresa may')
    self.starting_page = 1
    self.data_format = '%Y-%m-%d'
    # 'keywords={}' is an assumed query field for the search term: the
    # original string contained no placeholder, so its .format() was a no-op.
    self.site = (r'http://www.lemonde.fr/recherche/?keywords={}&operator=and'
                 r'&exclude_keywords=&qt=recherche_texte_titre&author='
                 r'&period=custom_date&start_day=01&start_month=01&start_year=2000'
                 r'&end_day=28&end_month=03&end_year=2017&sort=desc').format(
                     self.politicians[self.storage['politNum']])
def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['ksta_de']
    self.politicians = ('sarkozy', 'hollande', 'medwedew', 'cameron',
                        'putin', 'merkel', 'theresa+may')
    self.site = r'http://www.ksta.de/action/ksta/4484314/search?'
    self.data_format = '%Y-%m-%d'
    self.starting_page = 0
    self.update_payload()
def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['independent']
    self.politicians = ('nicolas sarkozy', 'francois hollande', 'dmitry medvedev',
                        'david cameron', 'vladimir putin', 'angela merkel',
                        'theresa may')
    self.site = r'http://www.independent.co.uk/search/site/{}'.format(
        self.politicians[self.storage['politNum']])
    self.data_format = '%Y-%m-%d'
    self.starting_page = 0
    self.update_payload()
def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['spiegel']
    self.politicians = ('nicolas_sarkozy', 'francois_hollande', 'dmitrij_medwedew',
                        'david_cameron', 'wladimir_putin', 'angela_merkel',
                        'theresa_may')
    self.starting_page = 1
    self.data_format = '%d.%m.%Y'
    # Resume from the stored page number, but never before the first page.
    self.site = r'http://www.spiegel.de/thema/{}/dossierarchiv-{}.html'.format(
        self.politicians[self.storage['politNum']],
        max(self.starting_page, self.storage['pn']))
    self.payload = None
def __init__(self, shelf, pswd):
    Crawler.__init__(self, pswd)
    self.respage = Resultpage()
    self.article = Article()
    self.storage = shelf['guardian']
    self.politicians = ('nicolas-sarkozy', 'francois-hollande', 'dmitry-medvedev',
                        'davidcameron', 'vladimir-putin', 'angela-merkel',
                        'theresamay')
    # Indices of the UK politicians (davidcameron, theresamay), which are
    # filed under /politics rather than /world on theguardian.com.
    self.local = (3, 6)
    self.starting_page = 1
    self.data_format = '%Y-%m-%d'
    self.site = r'https://www.theguardian.com/{}/{}?'.format(
        'world' if self.storage['politNum'] not in self.local else 'politics',
        self.politicians[self.storage['politNum']])
def __init__(self, login_id=None, last_name=None, pin=None):
    """Constructor for getting login credentials."""
    # Initialize the base class Crawler.
    Crawler.__init__(self)
    # Structure the authentication data into a dict for posting to the server.
    self.auth_data = {
        'loginType': 'B',
        'loginId': login_id,
        'lastName': last_name,
        'pin': pin,
        'page.logIn.library': '1@VYKDB20011102005217',
    }
    self.books = None
    self.content = ''
    # Log in to the website so later requests can reuse this session;
    # store the login status.
    self.status = self.login()
    if self.status is True:
        self.books = self.get_books()
def __init__(self): Crawler.__init__(self) self.HOST = "http://56110.cn" self.suffix = "/Huo/list.html"
def __init__(self, start_url=START_URL):
    Crawler.__init__(self, start_url)
    self.tasks = []
def __init__(self): Crawler.__init__(self) self.HOST = "http://wb.56888.net" self.prefix = "/OutSourceList.aspx?tendertype=4&p="
def __init__(self):
    Crawler.__init__(self)
    self.url = 'http://www.google.com/search'
    self.params = {"tbs": "li:1"}  # tbs=li:1 requests Google's verbatim results
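# A minimal sketch of how this crawler's url/params pair could be issued,
# assuming the widely used 'requests' library (not shown in the source);
# the 'q' query parameter is the standard Google search term field:
import requests

response = requests.get('http://www.google.com/search',
                        params={'tbs': 'li:1', 'q': 'example query'},
                        timeout=10)
print(response.status_code)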
def __init__(self, config):
    Crawler.__init__(self, config)
def __init__(self, proj_name): Crawler.__init__(self, proj_name) self.name = "虎嗅" self.root_url = "http://www.huxiu.com"
def __init__(self): Crawler.__init__(self) self.HOST = "http://fala56.com" self.prefix = "/Views/Huoyuan" self.suffix = "/GoodsLandList.aspx?area=-1"
def __init__(self, proj_name): Crawler.__init__(self, proj_name) self.name = "36氪" self.root_url = "http://36kr.com"
def __init__(self): Crawler.__init__(self) self.HOST = "http://www.chinawutong.com" self.prefix = "/103.html?pid="
def __init__(self, proj_name): Crawler.__init__(self, proj_name) self.name = "网易科技" self.root_url = "http://tech.163.com/gd/"
def __init__(self, proj_name): Crawler.__init__(self, proj_name) self.name = "极客公园" self.root_url = "http://www.geekpark.net"
def __init__(self, proj_name): Crawler.__init__(self, proj_name) self.name = "砍柴网" self.root_url = "http://www.ikanchai.com/"
def __init__(self): Crawler.__init__(self) self.HOST = "http://www.51yunli.com" self.prefix = "/goods/0/0/" self.suffix = "/0" self.MAX_PAGE = 7
def __init__(self): Crawler.__init__(self) self.HOST = "http://www.8glw.com" self.prefix = "/main_info.asp?id=1&page="
def __init__(self, dbname=""): Crawler.__init__(self, dbname)
def __init__(self): Crawler.__init__(self) self.HOST = "http://www.0256.cn" self.prefix = "/goods/?PageIndex="