Python examples for getpage

Programming language: Python

Namespace/package name: app.aml_utils

Method/function: getpage

Examples at hotexamples.com: 3

getpage in Python: 3 examples found. These are the top real-world Python examples of app.aml_utils.getpage, extracted from open source projects.

Example #1

File: news_source_entertainment.py Project: nandopedrosa/as_mais_lidas

def get_most_read(key):
    """
    Gets the most read news from a given page

    :param key: the key of the source page (e.g. g1)
    :return: a list with the most read news from the page and the name of news source
    """
    ns = get_ns(key)
    response, content = getpage(ns.url)  # Download the page
    soup = parsepage(content)  # Then we parse it
    strategy = strategies[key]  # Then we execute the selected Strategy based on the source
    return strategy(soup), ns.name
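
The implementation of app.aml_utils.getpage is not shown in these examples; the call sites only indicate that it takes a URL and returns a (response, content) pair. A minimal sketch of a helper with that shape, assuming only the standard library, might look like the following. The name getpage_sketch, the timeout parameter, and the User-Agent header are all assumptions for illustration, not the project's actual code.

```python
import urllib.request


def getpage_sketch(url, timeout=10):
    """Hypothetical stand-in for getpage: fetch a URL, return (response, content)."""
    # Some sites reject requests without a browser-like User-Agent (assumed header value).
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content = response.read()  # raw bytes of the body
    return response, content


# A data: URL keeps the demonstration offline; a real call would pass an http(s) URL.
response, content = getpage_sketch("data:text/plain,hello")
print(content)
```

The caller can then hand content to an HTML parser, as the examples on this page do with parsepage.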

Example #2

File: news_source_technology.py Project: nandopedrosa/as_mais_lidas

def get_most_read(key):
    """
    Gets the most read news from a given page

    :param key: the key of the source page (e.g. g1)
    :return: a list with the most read news from the page and the name of news source
    """
    ns = get_ns(key)
    response, content = getpage(ns.url)  # Download the page
    soup = parsepage(content)  # Then we parse it
    strategy = strategies[key]  # Then we execute the selected Strategy based on the source
    return strategy(soup), ns.name
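
Both get_most_read examples pick a parsing function from a strategies mapping keyed by news source. The project's actual mapping and its get_ns helper are not shown here, so the following is only a sketch of that dispatch pattern under stated assumptions: NewsSource, sources, first_words, upper_words, and the fetch parameter are hypothetical names, and plain strings stand in for parsed pages.

```python
from collections import namedtuple

# Hypothetical stand-in for whatever get_ns(key) returns in the project.
NewsSource = namedtuple("NewsSource", ["name", "url"])


def first_words(text):
    """Toy strategy: keep the first two words of the page text."""
    return text.split()[:2]


def upper_words(text):
    """Toy strategy: upper-case every word of the page text."""
    return [word.upper() for word in text.split()]


# One parsing strategy per source key, looked up at call time.
strategies = {"short": first_words, "loud": upper_words}
sources = {
    "short": NewsSource("Short News", "https://example.com/short"),
    "loud": NewsSource("Loud News", "https://example.com/loud"),
}


def get_most_read(key, fetch):
    """Fetch the source's page, then apply the strategy registered for that key."""
    ns = sources[key]
    content = fetch(ns.url)  # fetch is injected so the sketch runs without network access
    strategy = strategies[key]
    return strategy(content), ns.name


print(get_most_read("loud", lambda url: "a b c"))
```

Adding a new source in this arrangement means registering one more function in the mapping; get_most_read itself does not change.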

Example #3

File: news_source_national.py Project: nandopedrosa/as_mais_lidas

def g1(soup):
    """
    Gets the most read news from the g1 page

    :param soup: the BeautifulSoup object
    :return: a list with the most read news from the G1 Page
    """
    news = []
    scripts = soup.find_all('script')

    for script in scripts:
        script_content = script.text

        # G1 content is now generated by script. First we find the right script, since there are several
        if script_content.find('#G1-POST-TOP') != -1:
            i = 0

            # Retrieve the most accessed URLs
            while True:
                # First we find a top post (URL) using this search key
                key_index = script_content.find('#G1-POST-TOP', i)

                if key_index == -1:
                    break

                # Now we find the start of the URL
                start_index = script_content.rfind('"', 0, key_index) + 1

                # Now we find the end of the URL
                end_index = script_content.find('"', key_index)

                # And now we take the URL (substring)
                url = script_content[start_index:end_index]

                # With the URL, we open the page and find its title
                response, content = getpage(url)
                soup2 = parsepage(content)
                title = soup2.find('h1', class_='content-head__title').string

                news.append(dict(title=title, link=url))

                # Prepare the next search index
                i = key_index + 10

    return news
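
The scanning loop in g1 can be tried in isolation, without downloading anything. The sketch below repeats the same find/rfind steps as a standalone function; extract_marked_urls and the sample string are hypothetical, and the guard for a missing closing quote is an addition that the original loop does not have.

```python
def extract_marked_urls(script_content, marker="#G1-POST-TOP"):
    """Collect every double-quoted string in script_content that contains marker."""
    urls = []
    i = 0
    while True:
        # Locate the next occurrence of the marker at or after position i.
        key_index = script_content.find(marker, i)
        if key_index == -1:
            break

        # The URL starts right after the nearest quote before the marker...
        start_index = script_content.rfind('"', 0, key_index) + 1
        # ...and ends at the first quote after it.
        end_index = script_content.find('"', key_index)
        if end_index == -1:
            break  # unterminated string: stop instead of slicing a wrong span

        urls.append(script_content[start_index:end_index])

        # Resume the search just past this marker.
        i = key_index + len(marker)
    return urls


# Made-up script text, used only to exercise the function.
sample = 'var posts = ["https://example.com/a#G1-POST-TOP", "https://example.com/b#G1-POST-TOP"];'
print(extract_marked_urls(sample))
```

Separating the extraction from the page downloads makes the string handling easy to check on its own before any call to getpage.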