Python WebParser.get_data_with_tags Examples

Programming Language: Python

Namespace/Package Name: web_parser

Class/Type: WebParser

Method/Function: get_data_with_tags

Examples at hotexamples.com: 2

Python WebParser.get_data_with_tags: 2 examples found. These are the top-rated real-world Python examples of web_parser.WebParser.get_data_with_tags, extracted from open source projects.

Frequently Used Methods

WebParser(5)

feed(2)

get_data_with_tags(2)

download_zipfile(1)

execute_command(1)

get_laneinfo(1)

Example #1

    # Imports this method relies on (module level; import paths are assumed)
    import os
    import ssl
    import sys
    from urllib.request import urlopen

    from web_parser import WebParser

    def collect_links_and_data(self, page_url):
        # Works around an SSL certificate issue for some macOS users.
        # Note: this disables HTTPS certificate verification process-wide.
        if (not os.environ.get('PYTHONHTTPSVERIFY', '')
                and getattr(ssl, '_create_unverified_context', None)):
            ssl._create_default_https_context = ssl._create_unverified_context
        try:
            html_string = ""
            response = urlopen(page_url)
            # Only read the body if the server reports an HTML response; the
            # "" default avoids a TypeError when the header is missing
            if "text/html" in response.getheader("Content-Type", ""):
                html_bytes = response.read()  # Read the raw response bytes
                html_string = html_bytes.decode("utf-8")  # Decode as UTF-8

            parser = WebParser(self.base_url)  # Custom parser, given the base URL
            parser.feed(html_string)  # Run the parser over the HTML
            self.data_list = parser.get_data_with_tags()  # Retrieve parsed data
        except Exception as e:
            print("Error: " + str(e))
            print("Program will terminate")
            sys.exit(1)  # Exit with a non-zero status to signal failure
        return parser.get_page_urls()
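
The example above accepts any response whose Content-Type contains "text/html" and always decodes the body as UTF-8. The following is a minimal sketch of a more tolerant variant that honours the charset declared in the header. The helper name `decode_html` is hypothetical and is not part of the original project; it takes the header value and the body bytes directly so it can be exercised without a network call.

```python
from email.message import Message


def decode_html(content_type, body):
    """Return body decoded as text if content_type is HTML, else ''.

    Hypothetical helper: content_type is the raw Content-Type header value
    (or None if the header is missing), body is the response bytes.
    """
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_type() strips parameters such as "; charset=..."
    if msg.get_content_type() != "text/html":
        return ""
    # Fall back to UTF-8 when no charset parameter is declared
    charset = msg.get_content_charset() or "utf-8"
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # The declared charset is not a codec Python knows about
        return body.decode("utf-8", errors="replace")
```

With a real `urlopen` response, the same information should be available through `response.headers.get_content_type()` and `response.headers.get_content_charset()`, since the headers object is a subclass of `email.message.Message`.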

Example #2

File: scraper.py Project: Wulff-1996/web_scraper_python_exam

    # Imports this method relies on (module level; import paths are assumed)
    import sys
    from urllib.request import urlopen

    from web_parser import WebParser

    def collect_links_and_data(self, page_url):
        try:
            html_string = ""
            response = urlopen(page_url)
            # Only read the body if the server reports an HTML response; the
            # "" default avoids a TypeError when the header is missing
            if "text/html" in response.getheader("Content-Type", ""):
                html_bytes = response.read()  # Read the raw response bytes
                html_string = html_bytes.decode("utf-8")  # Decode as UTF-8

            parser = WebParser(self.base_url)  # Custom parser, given the base URL
            parser.feed(html_string)  # Run the parser over the HTML
            self.data_list = parser.get_data_with_tags()  # Retrieve parsed data
        except Exception as e:
            print("Error: " + str(e))
            print("Program will terminate")
            sys.exit(1)  # Exit with a non-zero status to signal failure
        return parser.get_page_urls()
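
Neither example shows the WebParser class itself. The sketch below is a hypothetical stand-in built on the standard library's `html.parser.HTMLParser`, only to illustrate the call sequence used above: construct with a base URL, `feed()` the HTML, then read `get_data_with_tags()` and `get_page_urls()`. The return shapes (a list of `(tag, text)` pairs and a set of absolute URLs) are assumptions; the real `web_parser.WebParser` may behave differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Elements that never have a closing tag, so they must not go on the stack
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class WebParser(HTMLParser):
    """Hypothetical stand-in for web_parser.WebParser (assumed behaviour)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self._open_tags = []        # stack of currently open tags
        self._data_with_tags = []   # collected (tag, text) pairs
        self._page_urls = set()     # absolute URLs found in <a href="...">

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._open_tags.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the base URL
                self._page_urls.add(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self._open_tags:
            # Attribute the text to the innermost open tag
            self._data_with_tags.append((self._open_tags[-1], text))

    def get_data_with_tags(self):
        return list(self._data_with_tags)

    def get_page_urls(self):
        return set(self._page_urls)


parser = WebParser("https://example.com/")
parser.feed('<h1>Title</h1><p>Hello <a href="/about">about us</a></p>')
print(parser.get_data_with_tags())
print(parser.get_page_urls())
```

Returning copies from the two getters keeps callers from mutating the parser's internal state, which matters if the same parser instance is fed more HTML later.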