Python unescape examples

Programming language: Python

Namespace/package name: cocktails.utils

Method/function: unescape

Examples on hotexamples.com: 4

Python unescape: 4 examples found. These are the top-rated real-world Python examples of cocktails.utils.unescape, extracted from open source projects.

Example #1

File: allPythonContent.py Project: Mondego/pyreco

def html_to_text(s):
	# strip tags
	s = re.sub(r'<\W*(?:b|big|i|small|tt|abbr|acronym|cite|code|dfn|em|kbd|strong|samp|var|a|bdo|q|span|sub|sup)\b[^>]*?>', '', s, flags=re.I)
	s = re.sub(r'<[^>]*?>', ' ', s)
	# replace entities
	s = unescape(s)
	# strip leading and trailing spaces
	s = s.strip()
	# replace all sequences of subsequent whitespaces with a single space
	s = re.sub(r'\s+', ' ', s)
	return s

Example #2

def html_to_text(s):
    # strip tags
    s = re.sub(
        r'<\W*(?:b|big|i|small|tt|abbr|acronym|cite|code|dfn|em|kbd|strong|samp|var|a|bdo|q|span|sub|sup)\b[^>]*?>',
        '',
        s,
        flags=re.I)
    s = re.sub(r'<[^>]*?>', ' ', s)
    # replace entities
    s = unescape(s)
    # strip leading and trailing spaces
    s = s.strip()
    # replace all sequences of subsequent whitespaces with a single space
    s = re.sub(r'\s+', ' ', s)
    return s
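
The listing does not show how cocktails.utils.unescape is implemented, so html_to_text cannot be run as printed. The following is a minimal sketch that makes it self-contained, assuming Python 3 and using the standard library's html.unescape as a stand-in for the project's own helper. The names INLINE_TAGS and ANY_TAG are introduced here for illustration and do not appear in the original code.

```python
import html
import re

# Inline-level tags are removed outright, so the words they wrap stay joined.
INLINE_TAGS = re.compile(
    r'<\W*(?:b|big|i|small|tt|abbr|acronym|cite|code|dfn|em|kbd|strong|samp'
    r'|var|a|bdo|q|span|sub|sup)\b[^>]*?>',
    flags=re.I,
)
# Every remaining tag is replaced by a space, so block boundaries split words.
ANY_TAG = re.compile(r'<[^>]*?>')


def html_to_text(s):
    s = INLINE_TAGS.sub('', s)
    s = ANY_TAG.sub(' ', s)
    # Assumption: html.unescape stands in for cocktails.utils.unescape.
    s = html.unescape(s)
    s = s.strip()
    # Collapse each run of whitespace into a single space.
    return re.sub(r'\s+', ' ', s)


print(html_to_text('<p>Gin &amp; <b>Ton</b>ic</p>'))
```

Removing inline tags before the generic tag pass is what keeps "Ton" and "ic" together in the call above, while a tag such as br still separates the words around it.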

Example #3

    def parse_recipe(self, response):
        hxs = HtmlXPathSelector(response)

        for title in hxs.select(
                "//meta[@property='og:title']/@content").extract():
            break
        else:
            return []

        for picture in hxs.select(
                "//*[@id='drink_infopicvid']/img/@src").extract():
            picture = urljoin(response.url, picture)
            break
        else:
            picture = None

        ingredients = []
        for node in hxs.select("//ul[@id='ingredients']/li"):
            parts = []

            for child in node.select('* | text()'):
                text = html_to_text(child.extract())

                if 'ingredient' in (child.xmlNode.prop('class') or '').split():
                    text = text.split('--')[-1]

                text = text.strip()

                if not text:
                    continue

                parts.append(text)

            ingredients.append(' '.join(parts))

        # don't crawl recipes like 'American Whiskey & Canadian Whisky',
        # that only consist of pouring a single spirit into a glass.
        if len(ingredients) <= 1:
            return []

        return [
            CocktailItem(title=unescape(title),
                         picture=picture,
                         url=response.url,
                         source='Esquire',
                         ingredients=ingredients)
        ]

Example #4

File: esquire.py Project: snoack/cocktail-search

    def parse_recipe(self, response):
        hxs = HtmlXPathSelector(response)

        for title in hxs.select("//meta[@property='og:title']/@content").extract():
            break
        else:
            return []

        for picture in hxs.select("//*[@id='drink_infopicvid']/img/@src").extract():
            picture = urljoin(response.url, picture)
            break
        else:
            picture = None

        ingredients = []
        for node in hxs.select("//ul[@id='ingredients']/li"):
            parts = []

            for child in node.select('* | text()'):
                text = html_to_text(child.extract())

                if 'ingredient' in (child.xmlNode.prop('class') or '').split():
                    text = text.split('--')[-1]

                text = text.strip()

                if not text:
                    continue

                parts.append(text)

            ingredients.append(' '.join(parts))

        # don't crawl recipes like 'American Whiskey & Canadian Whisky',
        # that only consist of pouring a single spirit into a glass.
        if len(ingredients) <= 1:
            return []

        return [CocktailItem(
            title=unescape(title),
            picture=picture,
            url=response.url,
            source='Esquire',
            ingredients=ingredients
        )]
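
parse_recipe uses a for loop that breaks immediately, paired with an else clause, to take the first extracted value or fall back when nothing was extracted. The sketch below isolates that idiom; first_or_default is a hypothetical helper name used only for illustration, and the next(iter(...)) form is shown as a possible equivalent, not as something the original project uses.

```python
def first_or_default(items, default=None):
    # The loop body runs at most once: break leaves `item` bound to the
    # first element. The else branch runs only if the loop never broke,
    # which here means `items` was empty.
    for item in items:
        break
    else:
        item = default
    return item


titles = ['Daiquiri', 'Mojito']
print(first_or_default(titles))          # first element
print(first_or_default([], 'untitled'))  # fallback for an empty list

# A shorter spelling with the same result for any iterable:
print(next(iter(titles), None))
```

In parse_recipe the two uses differ only in the fallback: a missing title aborts with an empty result, while a missing picture is recorded as None.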