Python Parser.clearText Examples

Programming Language: Python

Namespace/Package Name: goose.parsers

Class/Type: Parser

Method/Function: clearText

Examples at hotexamples.com: 2

Python Parser.clearText - 2 examples found. These are the top-rated real-world Python examples of goose.parsers.Parser.clearText, extracted from open source projects.

Frequently Used Methods

getAttribute(9)

fromstring(9)

css_select(6)

getPath(4)

hasChildTag(3)

clearText(2)

createElement(2)

getFormattedText(2)

hasChildTags(2)

adjustTopNode(1)

childNodesWithText(1)

getComments(1)

getElementById(1)

removeTitle(1)

Example #1

File: extractors.py Project: iKalin/python-goose

    def getMetaContent(self, doc, metaName):
        """\
        Extract the content of a given meta tag from the document
        """
        # metaName is used as a CSS selector; cssselect returns a list of matches
        meta = doc.cssselect(metaName)
        content = None

        # Take the 'content' attribute of the first match, if there is one
        if meta:
            content = meta[0].attrib.get('content')

        if content is not None:
            return Parser.clearText(content.strip())

        return ''
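
The pattern in Example #1 (select a meta tag, read its content attribute, clean the text, fall back to an empty string) can be sketched with the standard library alone. This is only an illustration under stated assumptions: clear_text is a hypothetical stand-in for Parser.clearText, whose actual behavior is not shown on this page, and it is assumed here to collapse runs of whitespace. MetaCollector and get_meta_content are likewise invented names, and matching on the name attribute replaces the CSS selector used in the original.

```python
import re
from html.parser import HTMLParser


def clear_text(text):
    # Hypothetical stand-in for Parser.clearText (assumption):
    # collapse whitespace runs into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


class MetaCollector(HTMLParser):
    """Collect the content attribute of every <meta> tag with a given name."""

    def __init__(self, meta_name):
        super().__init__()
        self.meta_name = meta_name
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.meta_name:
            self.contents.append(attributes.get("content"))


def get_meta_content(html, meta_name):
    # Mirror the original control flow: first match wins, '' when nothing is found.
    collector = MetaCollector(meta_name)
    collector.feed(html)
    content = collector.contents[0] if collector.contents else None
    if content is not None:
        return clear_text(content)
    return ""
```

A call such as get_meta_content(page_source, "description") returns the cleaned description, or an empty string when the tag or its content attribute is missing.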

Example #2

File: extractors.py Project: iKalin/python-goose

    def extractTags(self, article):
        node = article.doc

        # node doesn't have children
        if len(node) == 0:
            return NO_STRINGS

        elements = node.cssselect(A_REL_TAG_SELECTOR)
        if not elements:
            elements = node.cssselect(A_HREF_TAG_SELECTOR)
            if not elements:
                return NO_STRINGS

        tags = []
        for el in elements:
            tag = Parser.clearText(Parser.getText(el).strip())
            if tag:
                tags.append(tag)

        return set(tags)
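
The fallback-and-deduplicate logic of Example #2 can be shown in isolation, without lxml. The sketch below makes several assumptions: clear_text is a hypothetical stand-in for Parser.clearText (assumed to normalize whitespace), collect_tags is an invented name, the two lists of strings stand in for the link texts matched by A_REL_TAG_SELECTOR and A_HREF_TAG_SELECTOR, and an empty set stands in for NO_STRINGS, whose definition is not shown on this page.

```python
import re


def clear_text(text):
    # Hypothetical stand-in for Parser.clearText (assumption):
    # collapse whitespace runs into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def collect_tags(primary_texts, fallback_texts):
    # Use the primary matches when there are any, otherwise the fallback ones.
    texts = primary_texts or fallback_texts
    if not texts:
        return set()  # plays the role of NO_STRINGS in the original

    # Clean each candidate, drop the ones that end up empty, and deduplicate.
    tags = set()
    for text in texts:
        tag = clear_text(text)
        if tag:
            tags.add(tag)
    return tags
```

Building the set directly gives the same result as appending to a list and calling set() at the end, as the original does.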