def next_depth(self, response):
    """Record a finished fetch: harvest the page's outgoing links,
    invoke the completion callback, and mark the URL(s) as crawled.

    response -- fetch result exposing .body, .url, .effective_url
                (post-redirect URL) and .args.
    """
    page_url = response.effective_url
    # Queue every discovered link; remember (link, referer, title)
    # only for links that inject_url actually accepted.
    for href, anchor_text in URL.link_title(response.body, page_url):
        if self.inject_url(href, response.args):
            self.link_title_db.add(href, page_url, anchor_text)
    if callable(self.callback):
        self.callback(response)
    # 2 == "fully crawled"; when a redirect occurred, mark both the
    # final and the originally requested URL so neither is re-fetched.
    self.crawled[page_url] = 2
    if page_url != response.url:
        self.crawled[response.url] = 2
    self.referer = page_url
def parser(self, html, sp, homepage):
    """Extract outbound blog/homepage links from a profile page.

    html     -- raw page body; falsy input short-circuits to None.
    sp       -- service-provider key: 'baidu', 'sohu', '163', or any
                other value for generic anchor extraction.
    homepage -- the page's own URL; excluded from the results.

    Returns a de-duplicated list of normalized URLs, or None when the
    page (or a required follow-up request, for 'baidu') yields nothing.
    """
    if not html:
        return None
    links = []
    if sp == 'baidu':
        # The friend list sits behind an API endpoint keyed on the
        # encoded user name embedded in the profile page.
        username = re.findall(r'nameEnc: "([^^].*?)"', html)
        if not username:
            return None
        link = 'http://frd.baidu.com/api/friend.getlist?un=%s' % username[0]
        mario = Mario()
        response = mario.get(link)
        if not response or not response.body:
            return None
        names = re.findall(
            r'\["([^^].*?)","[^^].*?","[^^].*?","[^^].*?",\d+,"[^^].*?",\d+,\d+\]',
            response.body)
        if not names:
            return None
        bsp = BSP()
        for n in names:
            u = bsp.normalize('http://hi.baidu.com/sys/checkuser/%s' % n)
            if u and u[1] != homepage and u[1] not in links:
                # BUG FIX: was links.append(u). Every other branch
                # collects the normalized URL (index 1), and the guard
                # above tests 'u[1] not in links' -- appending the whole
                # tuple defeated de-duplication and returned a list
                # mixing tuples with strings.
                links.append(u[1])
    elif sp == 'sohu':
        bsp = BSP()
        for url in re.findall(r'"link" : "([^^].*?)"', html, re.I):
            r = bsp.normalize(url)
            if r and r[1] != homepage and r[1] not in links:
                links.append(r[1])
    elif sp == '163':
        # NOTE(review): this pattern is not valid regex -- '"******"'
        # is a "multiple repeat" and re will raise re.error at call
        # time. It appears the original capture group was scrubbed
        # (e.g. by credential masking); restore from upstream source.
        # Kept byte-identical because the true pattern is unknowable
        # from here.
        usernames = re.findall('.userName="******"', html)
        bsp = BSP()
        for u in usernames:
            if not u:
                continue
            # NOTE(review): 'http:%s...' lacks '//' -- verify valid163
            # really expects this exact form before "fixing" it.
            link = bsp.valid163(u, 'http:%s.blog.163.com/' % u, '163')
            if link and link[1] and link[1] not in links:
                links.append(link[1])
    else:
        # Generic fallback: normalize every anchor found in the page.
        bsp = BSP()
        for href, title in URL.link_title(html, homepage):
            if not href:
                continue
            r = bsp.normalize(href)
            if r and r[1] != homepage and r[1] not in links:
                links.append(r[1])
    return links