Example #1
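This `get` method drives a small recursive crawler: it fetches the current URL, saves the response, then follows the page's CSS, JS, and anchor links, all bounded by global crawl-count and content-size limits.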
    def get(self):
        global total_data, crawl_count, crawled

        # Stop once the crawl budget (DEPTH_LIMIT pages) is exhausted.
        if crawl_count >= DEPTH_LIMIT:
            return False

        # Mark the URL as visited before fetching so recursive calls skip it.
        crawled.add(self.url)
        data = self.fetch()

        # Skip empty payloads (fetch() appears to return b' ' on failure).
        if data and data != bytearray(b' '):
            # Stop once the total downloaded content exceeds the byte limit.
            if total_data > CONTENT_LIMIT:
                return False

            total_data += len(data)
            crawl_count += 1
            webserver.save(self.url, self.root, self.type, data)

            # Parse the fetched document for further resources to crawl.
            s = Scraper(data, self.console)

            if self.type not in ["JS", "CSS"]:
                # Stylesheets referenced by this document.
                for link in s.get_css():
                    if link:
                        c = Crawler(link, self, "CSS", self.console)
                        if c.url not in crawled:
                            c.get()

                # Scripts referenced by this document.
                js_links = s.get_script()
                self.console.print(js_links)
                for link in js_links:
                    if link:
                        c = Crawler(link, self, "JS", self.console)
                        if c.url not in crawled:
                            c.get()

            # Anchor hrefs are followed only from HTML documents.
            if self.type == "HTML":
                for link in s.get_links():
                    if link:
                        c = Crawler(link, self, "HTML", self.console)
                        if c.url not in crawled:
                            c.get()
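A minimal sketch of how this method might be driven, assuming the Crawler(url, parent, type, console) signature implied by the recursive calls above. The entry URL, the limit values, the None parent for the seed crawler, and the rich Console are illustrative assumptions, not part of the original example; the globals must live in the same module as Crawler for the global statement to see them.

    # Hypothetical driver, mirroring the globals the method expects.
    from rich.console import Console  # assumed: self.console.print() suggests rich

    DEPTH_LIMIT = 50            # illustrative page budget
    CONTENT_LIMIT = 10_000_000  # illustrative byte budget
    total_data = 0
    crawl_count = 0
    crawled = set()

    # Seed the crawl with an HTML page; get() recurses into linked resources.
    root = Crawler("https://example.com", None, "HTML", Console())
    root.get()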