Python StrUtil Examples

Programming language: Python

Namespace/package name: appbasket.utils

Class/type: StrUtil

Examples at hotexamples.com: 11

Python StrUtil - 11 examples found. These are the top-rated real-world Python examples of appbasket.utils.StrUtil, extracted from open source projects.

Frequently used methods

delWhiteSpace(10)

completeURL(1)
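
The page lists these two methods but does not show the source of StrUtil itself. The block below is a minimal sketch of what such a helper might look like, not the project's implementation. The class name StrUtilSketch, the whitespace-collapsing behaviour of delWhiteSpace, and the urljoin-based completeURL are all assumptions inferred from the method names and the call sites in the examples; the sketch targets Python 3.

```python
import re
from urllib.parse import urljoin


class StrUtilSketch(object):
    """Hypothetical stand-in for appbasket.utils.StrUtil (not the project's code)."""

    @staticmethod
    def delWhiteSpace(text):
        # Assumption: collapse every run of whitespace to one space,
        # then trim the leading and trailing space.
        return re.sub(r"\s+", " ", text).strip()

    @staticmethod
    def completeURL(prefix, url):
        # Assumption: resolve a possibly relative href against the prefix.
        return urljoin(prefix, url)


if __name__ == "__main__":
    print(StrUtilSketch.delWhiteSpace("  Photo \n Editor\t"))
    print(StrUtilSketch.completeURL("http://example.com/apps/", "list?page=2"))
```

The real delWhiteSpace may remove whitespace entirely rather than collapse it; the examples on this page do not settle that either way.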

Example #1

File: baidu_spider.py Project: willis-hu/app-basket

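    def getPageLink(self, selector, prefix):
        # Collect every pager href and resolve it against the site prefix.
        xpath = '//div[@class="pager"]//a/@href'

        eles = selector.xpath(xpath).extract()
        for i in range(len(eles)):
            eles[i] = StrUtil.completeURL(prefix, eles[i])

        # Keeps only the links for which StrUtil.isEmpty returns a true value.
        return filter(StrUtil.isEmpty, eles)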

Example #2

File: wandoujia_spider.py Project: willis-hu/app-basket

    def getTag(self, selector, item):
        xpath = '//div[@class="side-tags clearfix"]/div/a/text()'

        tag = ""
        tags = selector.xpath(xpath).extract()
        for i in range(len(tags)):
            if (i):
                tag = tag + "-" + StrUtil.delWhiteSpace(tags[i])
            else:
                tag = StrUtil.delWhiteSpace(tags[i])

        if (0 != len(tag)):
            item['tag'] = tag
        else:
            item['tag'] = "NULL"

        LogUtil.log("tag(%s)" % item['tag'])

        return

Example #3

File: wandoujia_spider.py Project: willis-hu/app-basket

    def getCategory(self, selector, item):
        xpath = '//dd[@class="tag-box"]/a/text()'

        category = ""
        categories = selector.xpath(xpath).extract()
        for i in range(len(categories)):
            if (i):
                category = category + "-" + StrUtil.delWhiteSpace(
                    categories[i])
            else:
                category = StrUtil.delWhiteSpace(categories[i])

        if (0 != len(category)):
            item['category'] = category
        else:
            item['category'] = "NULL"

        LogUtil.log("category(%s)" % item['category'])

        return
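
The tag and category loops in the two wandoujia_spider.py examples above build a "-"-separated string by hand and fall back to "NULL" when nothing is left. A more compact way to express the same idea is sketched below. The function name join_labels and its parameters are hypothetical, and the default clean callable (a plain strip) only stands in for StrUtil.delWhiteSpace, whose exact behaviour is not shown on this page.

```python
def join_labels(labels, clean=lambda s: s.strip(), default="NULL"):
    """Join cleaned labels with "-"; return default when the result is empty."""
    # clean is a hypothetical stand-in for StrUtil.delWhiteSpace.
    joined = "-".join(clean(label) for label in labels)
    return joined if joined else default


if __name__ == "__main__":
    print(join_labels([" Tools ", "Games\n"]))
    print(join_labels([]))
```

Inside a spider method this could be called as item['tag'] = join_labels(tags, clean=StrUtil.delWhiteSpace), assuming StrUtil is imported as in the project.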

Example #4

File: baidu_spider.py Project: willis-hu/app-basket

    def getEditorComment(self, selector, item):
        xpath = '//div[@class="app-detail"]//span[@class="head-content"]/text()'

        eles = selector.xpath(xpath).extract()

        editor_comment = "NULL"
        if (0 != len(eles)):
            editor_comment = eles[0]
        item['editor_comment'] = StrUtil.delWhiteSpace(editor_comment)

        LogUtil.log("editor_comment(%s)" % item['editor_comment'])    

        return

Example #5

File: baidu_spider.py Project: willis-hu/app-basket

    def getName(self, selector, item):
        xpath = '//div[@class="app-intro"]//h1[@class="app-name"]/span/text()'

        eles = selector.xpath(xpath).extract()

        name = "NULL"
        if (0 != len(eles)):
            name = eles[0]

        item['name'] = StrUtil.delWhiteSpace(name)
        LogUtil.log("name(%s)" % item['name'])

        return

Example #6

File: wandoujia_spider.py Project: willis-hu/app-basket

    def getVersion(self, selector, item):
        # xpath = '//dl[@class="infos-list"]/dd[5]/text()'
        xpath = u'//dl[@class="infos-list"]/dt[text() = "版本"]/following::*[1]/text()'
        eles = selector.xpath(xpath).extract()

        if (0 != len(eles)):
            item['version'] = StrUtil.delWhiteSpace(eles[0])
        else:
            item['version'] = "NULL"

        LogUtil.log("version(%s)" % item['version'])

        return

Example #7

File: wandoujia_spider.py Project: willis-hu/app-basket

    def getName(self, selector, item):
        xpath = '//p[@class="app-name"]/span[@class="title" and @itemprop="name"]/text()'

        eles = selector.xpath(xpath).extract()

        name = "NULL"
        if (0 != len(eles)):
            name = eles[0]

        item['name'] = StrUtil.delWhiteSpace(name)
        LogUtil.log("name(%s)" % item['name'])

        return
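
Several of the single-field extractors above (editor comment, name, version) share one pattern: take the first XPath match, clean it, and fall back to "NULL" when there is no match. That pattern could be factored into a small helper, sketched here under stated assumptions. The name first_or_default and its parameters are hypothetical, and the default clean callable (a plain strip) only stands in for StrUtil.delWhiteSpace.

```python
def first_or_default(values, clean=lambda s: s.strip(), default="NULL"):
    """Return the cleaned first element of values, or default when values is empty."""
    # clean is a hypothetical stand-in for StrUtil.delWhiteSpace.
    if not values:
        # No match: return the default as-is, without cleaning it.
        return default
    return clean(values[0])


if __name__ == "__main__":
    print(first_or_default([" WeChat \n", "ignored"]))
    print(first_or_default([]))
```

Note that the getName and getEditorComment examples also pass the "NULL" fallback through delWhiteSpace, while getVersion does not; this sketch follows the getVersion behaviour.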

Example #8

File: baidu_spider.py Project: willis-hu/app-basket

    def getDescInfo(self, selector, item):
        xpath = '//div[@class="app-detail"]//div[@class="brief-long"]/p//text()'

        eles = selector.xpath(xpath).extract()
        # eles = selector.xpath(xpath).xpath('string(., " ")').extract()

        desc_info = "NULL"
        if (0 != len(eles)):
            desc_info = " ".join(eles)
        item['desc_info'] = StrUtil.delWhiteSpace(desc_info)

        LogUtil.log("desc_info(%s)" % item['desc_info'])    

        return

Example #9

File: wandoujia_spider.py Project: willis-hu/app-basket

    def getDescInfo(self, selector, item):
        xpath = '//div[@itemprop="description"]//text()'

        eles = selector.xpath(xpath).extract()
        # eles = selector.xpath(xpath).xpath('string(., " ")').extract()

        desc_info = "NULL"
        if (0 != len(eles)):
            desc_info = " ".join(eles)
        item['desc_info'] = StrUtil.delWhiteSpace(desc_info)

        LogUtil.log("desc_info(%s)" % item['desc_info'])

        return

Example #10

File: baidu_spider.py Project: willis-hu/app-basket

    def getSource(self, selector, item):
        xpath = '//div[@class="app-intro"]//div[@class="origin-wrap"]//a[@class="origin"]/text()'

        # Default to "NULL"; overwrite it only when the XPath matches.
        item['source'] = "NULL"

        eles = selector.xpath(xpath).extract()
        if (0 != len(eles)):
            item['source'] = StrUtil.delWhiteSpace(eles[0])

        LogUtil.log("source(%s)" % item['source'])

        return

Example #11

File: wandoujia_spider.py Project: willis-hu/app-basket

    def loadStartURLs(self):
        prefix = "http://www.wandoujia.com/apps/"
        # URLs from file
        with open('data/apps.txt', 'r') as f:
            for line in f:
                self.start_urls.append(prefix + StrUtil.delWhiteSpace(line))

        # Fixed URLs
        self.start_urls.append("http://www.wandoujia.com/apps")  # app home page
        self.start_urls.append("http://www.wandoujia.com/category/app")  # Android apps
        self.start_urls.append(
            "http://www.wandoujia.com/category/game")  # Android games
        # self.start_urls.append("http://www.wandoujia.com/apps/air.jp.funkyland.AliceHouse2") # old-version app
        # self.start_urls.append("http://www.wandoujia.com/apps/com.tencent.mm") # new-version app
        # self.start_urls.append("http://www.wandoujia.com/category/408") # travel category home page

        return