Python HtmlAnalyse.post_download Examples

Programming language: Python

Namespace/package name: Lib.NetCrawl.HtmlAnalyse

Class/type: HtmlAnalyse

Method/function: post_download

Examples at hotexamples.com: 2

Python HtmlAnalyse.post_download: 2 examples found. These are the top-rated real-world Python examples of Lib.NetCrawl.HtmlAnalyse.HtmlAnalyse.post_download extracted from open source projects.

Frequently used methods

get_bs_contents(30)

HtmlAnalyse(10)

download(6)

get_contents(3)

post_contents(3)

post_download(2)

Example #1

File: pdf_download1.py Project: RoyalClown/MyPython

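    # Excerpt from a class. Module-level imports are not shown in the source;
    # the code needs `re`, `random`, and presumably
    # `from Lib.NetCrawl.HtmlAnalyse import HtmlAnalyse`.
    def download(self, pdf_url):
        # NOTE: `a` is not defined in this method. It resolves to the
        # module-level `a` assigned in the __main__ block (see Example #2),
        # so this only works when the file is run as a script.
        content_list = re.match(r'downloadLinkClick\((.*?)\);return false',
                                a).group(1).split(",")
        # First argument is the quoted file name, e.g. 'E-HTQ_e.pdf'
        filename = content_list[0].replace("'", "")

        url = "http://ds.yuden.co.jp/TYCOMPAS/cs/detail.do?mode=download&fileName=" + filename

        # Remaining arguments are the three flags, kept as strings
        isSeriesData = content_list[1]
        isProductsData = content_list[2]
        isProductsDataGraph = content_list[3]
        DownloadForm = {
            "action": "detail.do",
            "classificationID": "AE",
            "fileName": filename,
            "isSeriesData": isSeriesData,
            "isProductsData": isProductsData,
            "isProductsDataGraph": isProductsDataGraph
        }
        # POST the form and save the response to a fixed local path
        html_analyse = HtmlAnalyse(url)
        html_analyse.post_download(
            data=DownloadForm,
            # Raw string: same path, without the invalid "\P", "\S", "\D" escapes
            path=r"I:\PythonPrj\StandardSpider\DataAnalyse\NewRules\a.pdf")

        # Second attempt through a proxy, saved under a random file name
        filename = self.path + str(random.random()) + '.pdf'
        try:
            html_analyse = HtmlAnalyse(url, proxy=self.proxy_ip)
            html_analyse.download(filename)
            print("Download complete.")
        except Exception as e:
            print(e)
            # Drop the failing proxy, pick another one, and retry
            self.proxy_pool.remove(self.proxy_ip)
            self.proxy_ip = self.proxy_pool.get()
            # Return the retry's result rather than the failed attempt's name
            return self.download(pdf_url)

        return filename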
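
The retry in the example above recurses without an upper bound, so a run of bad proxies can exhaust the stack. A bounded loop is one possible alternative. The sketch below is only an illustration: `ProxyPool` is a hypothetical stand-in for the project's proxy pool (only its `get` and `remove` methods appear in the source), and `fetch` is an assumed callable that performs one download attempt through a given proxy.

```python
from collections import deque


class ProxyPool:
    """Hypothetical stand-in for the proxy pool used in the example."""

    def __init__(self, proxies):
        self._proxies = deque(proxies)

    def get(self):
        # Raises IndexError when no proxies remain
        return self._proxies[0]

    def remove(self, proxy):
        self._proxies.remove(proxy)


def download_with_retry(fetch, pool, max_attempts=3):
    """Call fetch(proxy) until it succeeds or the attempts run out.

    fetch is any callable that takes a proxy and raises on failure.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            proxy = pool.get()
        except IndexError:
            break  # pool is empty, nothing left to try
        try:
            return fetch(proxy)
        except Exception as exc:
            last_error = exc
            pool.remove(proxy)  # discard the failing proxy
    raise RuntimeError("download failed on every attempt") from last_error
```

With this shape, the caller decides how many proxies to burn through, and a total failure surfaces as one exception instead of a deep recursion.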

Example #2

File: pdf_download1.py Project: RoyalClown/MyPython

        # Tail of a method from the same file: fan the URLs out to worker threads
        threading_pool = ThreadingPool()
        threading_pool.multi_thread(thread, pdf_urls)


if __name__ == "__main__":
    # pdfdownload = PdfDownload(task_code="CCT2016120900000001")
    #
    # pdfdownload.go()

    # Sample onclick value copied from the target page
    a = "downloadLinkClick('E-HTQ_e.pdf',true,false,false);return false"
    # Extract the argument list and split it into file name + three flags
    content_list = re.match(r'downloadLinkClick\((.*?)\);return false',
                            a).group(1).split(",")
    filename = content_list[0].replace("'", "")

    url = "http://ds.yuden.co.jp/TYCOMPAS/cs/detail.do?mode=download&fileName=" + filename

    # Flags stay as the strings "true"/"false" and are sent as form fields
    isSeriesData = content_list[1]
    isProductsData = content_list[2]
    isProductsDataGraph = content_list[3]
    DownloadForm = {
        "action": "detail.do",
        "classificationID": "AE",
        "fileName": filename,
        "isSeriesData": isSeriesData,
        "isProductsData": isProductsData,
        "isProductsDataGraph": isProductsDataGraph
    }
    # POST the form and write the response to the given local path
    html_analyse = HtmlAnalyse(url)
    html_analyse.post_download(
        data=DownloadForm,
        # Raw string: same path, without the invalid "\P", "\S", "\D" escapes
        path=r"I:\PythonPrj\StandardSpider\DataAnalyse\NewRules\a.pdf")
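
The implementation of `HtmlAnalyse.post_download` is not shown on this page. As a rough, standard-library-only sketch of the same steps (parse the `onclick` value, build the form, prepare a POST request), something like the following could work. The helper names `parse_onclick` and `build_download_request` are hypothetical and do not come from the project; the URL and form fields are copied from the example above. The request is only constructed here, not sent.

```python
import re
from urllib.parse import urlencode
from urllib.request import Request

ONCLICK_RE = re.compile(r"downloadLinkClick\((.*?)\);return false")
BASE_URL = "http://ds.yuden.co.jp/TYCOMPAS/cs/detail.do"


def parse_onclick(onclick):
    """Return (filename, isSeriesData, isProductsData, isProductsDataGraph)."""
    match = ONCLICK_RE.match(onclick)
    if match is None:
        raise ValueError("unexpected onclick value: %r" % onclick)
    args = [part.strip() for part in match.group(1).split(",")]
    if len(args) != 4:
        raise ValueError("expected 4 arguments, got %d" % len(args))
    # The file name arrives wrapped in single quotes
    return args[0].strip("'"), args[1], args[2], args[3]


def build_download_request(onclick):
    """Build (but do not send) the POST request for one download link."""
    filename, series, products, graph = parse_onclick(onclick)
    form = {
        "action": "detail.do",
        "classificationID": "AE",
        "fileName": filename,
        "isSeriesData": series,
        "isProductsData": products,
        "isProductsDataGraph": graph,
    }
    query = urlencode({"mode": "download", "fileName": filename})
    body = urlencode(form).encode("ascii")  # form-encoded request body
    return Request(BASE_URL + "?" + query, data=body, method="POST")
```

Passing the returned object to `urllib.request.urlopen` and writing the response bytes to a file would complete the download; whether `post_download` works this way internally is an assumption. Note that splitting on `","` breaks if a file name ever contains a comma.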