Python DOM.get_elements_by_classname Examples

Programming language: Python

Namespace/package name: pattern.web

Class/type: DOM

Method/function: get_elements_by_classname

Examples on hotexamples.com: 1

Python DOM.get_elements_by_classname: 1 code example found. It is a real-world use of pattern.web.DOM.get_elements_by_classname in Python, extracted from an open source project.

Frequently used methods

DOM(30)

by_tag(20)

by_class(9)

by_attribute(3)

by_attr(1)

by_id(1)

get_elements_by_classname(1)

get_elements_by_tagname(1)

Example #1

File: imdb-crawler.py Project: Maarten-vd-Sande/Data-Processing

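# Imports this excerpt appears to rely on (not shown in the original snippet).
# Note that pattern.web's abs() shadows the built-in abs() in this module.
from pattern.web import DOM, abs


def scrape_top_250(url):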
    """
    Scrape the IMDB top 250 movies index page.

    Args:
        url: pattern.web.URL instance pointing to the top 250 index page

    Returns:
        A list of strings, where each string is the URL of a movie's page on
        IMDB. These URLs must be absolute (i.e. include the http part, the
        domain part and the path part).
    """
    movie_urls = []
    dom = DOM(url.download(cached=True))

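    # Every element with class "titleColumn" is expected to wrap one movie link.
    title_cells = dom.get_elements_by_classname("titleColumn")
    for cell in title_cells:
        # The snippet assumes the link is the cell's second child node (index 1).
        # abs() here is pattern.web's URL helper (it takes base=), not the built-in.
        link = abs(cell[1].attrs.get("href", ""), base=url.redirect or url.string)
        movie_urls.append(link)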

    # return the list of URLs of each movie's page on IMDB
    return movie_urls
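
The function above needs the pattern library and a live IMDB page to run. As a rough illustration of the same two steps (select elements by class name, then resolve each relative href against a base URL), here is a minimal sketch that uses only the Python 3 standard library instead of pattern.web. The names ClassLinkCollector, extract_movie_urls and SAMPLE_HTML, the example.com base URL and the sample markup are all hypothetical and are not taken from the original project or from IMDB.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ClassLinkCollector(HTMLParser):
    """Collect href values of <a> tags nested inside elements of a given class.

    Hypothetical helper for illustration; it is not part of pattern.web.
    """

    # Void elements never get a closing tag, so they are skipped when tracking depth.
    VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0   # greater than 0 while inside a matching element
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.depth > 0:
            # Already inside a matching element: track nesting and collect links.
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self.hrefs.append(attrs["href"])
        elif self.class_name in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1


def extract_movie_urls(html, base_url, class_name="titleColumn"):
    """Return absolute URLs for links found inside elements with class_name."""
    parser = ClassLinkCollector(class_name)
    parser.feed(html)
    # urljoin plays the role that pattern.web's abs(url, base=...) plays above.
    return [urljoin(base_url, href) for href in parser.hrefs]


# Made-up markup that only mimics the structure the original function expects.
SAMPLE_HTML = """
<table>
  <tr><td class="titleColumn">1. <a href="/title/tt0000001/">First Movie</a></td></tr>
  <tr><td class="titleColumn">2. <a href="/title/tt0000002/">Second Movie</a></td></tr>
  <tr><td class="ratingColumn"><a href="/ratings/">ignored</a></td></tr>
</table>
"""

movie_urls = extract_movie_urls(SAMPLE_HTML, "http://www.example.com/chart/top")
for movie_url in movie_urls:
    print(movie_url)
```

Unlike the original, which picks the child at index 1 of each cell, this sketch collects every link nested inside a matching element, so it does not depend on the exact order of child nodes.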