Python Crawler.company_info Examples

Programming language: Python

Namespace/package name: Crawler

Class/type: Crawler

Method/function: company_info

Examples at hotexamples.com: 1

Python Crawler.company_info - 1 example found. This is a top-rated real-world Python example of Crawler.Crawler.company_info, extracted from open source projects. You can rate examples to help improve their quality.

Example #1

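import json
import os

import pymysql

from Crawler import Crawler  # assumed import path, based on the package/class names listed above


def CompanyInfoThread(category):
    # Open the SQL connection (connection settings are left blank, as in the original)
    conn_cfg = {'host': '', 'user': '', 'password': '', 'db': ''}
    conn = pymysql.connect(**conn_cfg)
    try:
        cursor = conn.cursor()
        # Prefix match on the job category code. Passing the pattern as a query
        # parameter avoids doubling "%" by hand and prevents SQL injection.
        sql = "select distinct `公司連結` from db_104.job_link where `職類編號` LIKE %s"
        cursor.execute(sql, (str(category) + "%",))
        companylink = [row[0] for row in cursor.fetchall()]  # list of company links
    finally:
        conn.close()

    # Read each company link in turn and save the company info as JSON
    dn = "[directory]"
    os.makedirs(dn, exist_ok=True)
    for link in companylink:
        try:
            url = "https://" + link
            content = Crawler.company_info(url)
            fn = url.split("company/")[1].split("?")[0]
            with open(os.path.join(dn, fn + ".json"), "w", encoding="utf-8") as f:
                json.dump(content, f)
            print(url)
            print("Job category", category, "company:", fn, "complete")
        except Exception as exc:
            # Skip links that fail, but report them instead of failing silently
            print("failed:", link, exc)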
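
The example above derives each output file name with `url.split("company/")[1].split("?")[0]`, which raises an IndexError (caught by the surrounding try block) when the link contains no `company/` segment. As a rough sketch of that step in isolation, the helper below does the same extraction with the standard library's `urllib.parse`. The name `company_id_from_url` and the `example.com` URL are hypothetical and are not part of the original project.

```python
from urllib.parse import urlsplit


def company_id_from_url(url):
    """Return the path segment after 'company/', or None if it is absent.

    Hypothetical helper for illustration; not part of the original Crawler code.
    """
    path = urlsplit(url).path  # urlsplit separates the query string from the path
    marker = "company/"
    if marker not in path:
        return None
    return path.split(marker, 1)[1].strip("/") or None


# Made-up URL used only to show the behaviour
print(company_id_from_url("https://example.com/company/abc123?jobsource=test"))  # abc123
```

Returning None for unexpected links lets the caller decide whether to skip or log them, rather than relying on an exception to do so.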