def _process_html(self, html):
    """Extract URLs from *html* that match the target's grab pattern and
    push each unique one into the `sohu::url` set in redis.

    Links pointing at static resource files (as judged by
    StringUtil.is_resource_url) are discarded.
    """
    soup = BeautifulSoup(html)
    # Every tag whose href matches the configured grab regex.
    matched = soup.findAll(
        True, {'href': re.compile(self._target['grab_url_reg'])})
    # De-duplicate while filtering out resource-file links.
    urls = set(
        tag['href'] for tag in matched
        if not StringUtil.is_resource_url(tag['href'])
    )
    # Persist the collected URL set into redis, one member at a time.
    # NOTE(review): redis-py's native set-add command is `sadd`; `sset` is
    # presumably a project-specific wrapper method — confirm it exists.
    for url in urls:
        self._redis.sset('sohu::url', url)
    print(" >>>>finish push %s urls in `sohu::url` of redis" % len(urls))
def _process_html(self, html):
    """Parse *html* for hrefs matching the target grab regex and store
    each unique, non-resource URL in the redis set `sohu::url`.

    NOTE(review): this definition duplicates the `_process_html` earlier in
    the file and, if both are in the same class, shadows it — looks like an
    accidental paste; confirm and keep only one copy.
    """
    href_pattern = re.compile(self._target['grab_url_reg'])
    collected = set()
    for node in BeautifulSoup(html).findAll(True, {'href': href_pattern}):
        link = node['href']
        # Skip links to static resource files (images, css, js, ...).
        if StringUtil.is_resource_url(link):
            continue
        collected.add(link)
    # Push the de-duplicated URLs into redis one by one.
    # NOTE(review): `sset` is presumably a project redis wrapper; the
    # redis-py set-add command is `sadd` — confirm against the client.
    for link in collected:
        self._redis.sset('sohu::url', link)
    print(" >>>>finish push %s urls in `sohu::url` of redis" % len(collected))