Python MyHTMLParser.get_links Examples

Programming language: Python

Namespace / package name: htmlparser

Class / type: MyHTMLParser

Method / function: get_links

Examples on hotexamples.com: 2

MyHTMLParser.get_links in Python: 2 examples found. These are the top-rated real-world examples of htmlparser.MyHTMLParser.get_links in Python, extracted from open source projects.

Frequently Used Methods

MyHTMLParser(12)

feed(9)

get_starttag_data(4)

close(1)

get_links(1)

get_mboxes_links(1)

get_results(1)

getdataset(1)

getsubject(1)

gettags(1)

Example #1

File: backends.py
Project: GregSutcliffe/MailingListStats

    def fetch(self):
        """Get all the links listed in the Mailing List's URL.

        The archives are usually retrieved in descending
        chronological order (newest archives are always
        shown on the top of the archives). Reverse the list
        to analyze in chronological order.
        """

        mailing_list = self.mailing_list

        htmlparser = MyHTMLParser(mailing_list.location,
                                  self.web_user, self.web_password)
        # links = htmlparser.get_mboxes_links(self.force)
        links = self.filter_links(htmlparser.get_links())

        for link in links:
            basename = os.path.basename(link)
            destfilename = os.path.join(mailing_list.compressed_dir, basename)

            try:
                # If the URL is for the current month, always retrieve.
                # Otherwise, check visited status & local files first
                this_month = find_current_month(link)

                if this_month:
                    self._print_output(
                        'Current month detected: '
                        'Found substring %s in URL %s...' % (this_month, link))
                    self._print_output('Retrieving %s...' % link)
                    self._retrieve_remote_file(link, destfilename)
                elif os.path.exists(destfilename) and not self.force:
                    self._print_output('Already downloaded %s' % link)
                else:
                    self._print_output('Retrieving %s...' % link)
                    self._retrieve_remote_file(link, destfilename)
            except IOError:
                self._print_output("Unknown URL: " + link + ". Skipping.")
                continue

            yield MBoxArchive(destfilename, link)
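The excerpt calls `htmlparser.get_links()` but does not show how `MyHTMLParser` is implemented. As a rough illustration only, a link-collecting parser could be built on the standard library's `html.parser.HTMLParser`. The class name `LinkCollector`, its constructor signature, and the example URLs below are assumptions made for this sketch; the real `MyHTMLParser` takes a location plus credentials and presumably fetches the page itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical stand-in for MyHTMLParser (not the project's code).

    Collects the href target of every <a> tag fed to the parser and
    resolves relative targets against a base URL.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))

    def get_links(self):
        # Return a copy so callers cannot mutate the parser's state.
        return list(self.links)


# Made-up archive index page with one relative and one absolute-path link.
page = (
    '<a href="2023-January.txt.gz">January</a> '
    '<a href="/pipermail/dev/2023-February.txt.gz">February</a>'
)
collector = LinkCollector("https://lists.example.org/pipermail/dev/")
collector.feed(page)
links = collector.get_links()
```

In this sketch the caller feeds the HTML explicitly, which keeps the parsing step separate from any network access or authentication.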

Example #2

    def fetch(self):
        """Get all the links listed in the Mailing List's URL.

        The archives are usually retrieved in descending
        chronological order (newest archives are always
        shown on the top of the archives). Reverse the list
        to analyze in chronological order.
        """

        mailing_list = self.mailing_list

        htmlparser = MyHTMLParser(mailing_list.location,
                                  self.web_user, self.web_password)
        # links = htmlparser.get_mboxes_links(self.force)
        links = self.filter_links(htmlparser.get_links())

        for link in links:
            basename = os.path.basename(link)
            destfilename = os.path.join(mailing_list.compressed_dir, basename)

            try:
                # If the URL is for the current month, always retrieve.
                # Otherwise, check visited status & local files first
                this_month = find_current_month(link)

                if this_month:
                    self._print_output('Current month detected: '
                                       'Found substring %s in URL %s...'
                                       % (this_month, link))
                    self._print_output('Retrieving %s...' % link)
                    self._retrieve_remote_file(link, destfilename)
                elif os.path.exists(destfilename) and not self.force:
                    self._print_output('Already downloaded %s' % link)
                else:
                    self._print_output('Retrieving %s...' % link)
                    self._retrieve_remote_file(link, destfilename)
            except IOError:
                self._print_output("Unknown URL: " + link + ". Skipping.")
                continue

            yield MBoxArchive(destfilename, link)
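The `fetch` method depends on a `find_current_month` helper whose body is not part of the excerpt. From the call site it appears to return a truthy substring when the URL refers to the current month, so that the still-growing archive is always downloaded again. A minimal sketch of that idea follows. The function name `guess_current_month`, the optional `today` parameter, and the two candidate patterns (`YYYY-MonthName` and `YYYY-MM`) are assumptions for illustration and may not match the project's actual helper.

```python
import datetime

# Hard-coded English names avoid the locale dependence of strftime("%B").
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def guess_current_month(url, today=None):
    """Hypothetical helper: return the matching month substring, or None.

    `today` can be injected so the function is testable; it defaults
    to the real current date.
    """
    today = today or datetime.date.today()
    candidates = [
        "%d-%s" % (today.year, MONTH_NAMES[today.month - 1]),  # 2023-January
        "%d-%02d" % (today.year, today.month),                 # 2023-01
    ]
    for candidate in candidates:
        if candidate in url:
            return candidate
    return None


# Usage with a fixed date so the result does not depend on when it runs.
fixed_day = datetime.date(2023, 1, 15)
current = guess_current_month(
    "https://lists.example.org/pipermail/dev/2023-January.txt.gz", fixed_day)
older = guess_current_month(
    "https://lists.example.org/pipermail/dev/2022-December.txt.gz", fixed_day)
```

Returning the matched substring rather than a boolean mirrors how `fetch` interpolates `this_month` into its log message.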