Python Document.invalid_sids Examples

Programming language: Python

Namespace / package name: text.document

Class / type: Document

Method / function: invalid_sids

Examples on hotexamples.com: 1

Python Document.invalid_sids - 1 example found. This is the top-rated Python example for text.document.Document.invalid_sids, extracted from open source projects.

Frequently used methods

Document(16)

process_document(14)

sentence_tokenize(5)

sentences(4)

did(1)

invalid_sids(1)

Example #1

File: tempEval_corpus.py Project: lasigeBioTM/IBEnt

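    # Module-level imports, shown for completeness; load_corpus is a method of the corpus class.
    import logging
    import os

    import progressbar as pb  # assumed alias: the original uses `pb` without showing the import

    from text.document import Document

    def load_corpus(self, corenlpserver):
        # self.path is the base directory of the files of this corpus.
        # Read every file in the folder (see config file), skipping editor backups ending in '~'.
        trainfiles = [os.path.join(self.path, f) for f in os.listdir(self.path) if not f.endswith('~')]
        # Marker sentences that open and close the sections listed in self.invalid_sections.
        start_markers = {'[start section id="{}"]'.format(section) for section in self.invalid_sections}
        end_markers = {'[end section id="{}"]'.format(section) for section in self.invalid_sections}
        widgets = [pb.Percentage(), ' ', pb.Bar(), ' ', ' ', pb.Timer()]
        pbar = pb.ProgressBar(widgets=widgets, maxval=len(trainfiles)).start()
        for i, openfile in enumerate(trainfiles):
            with open(openfile, 'r') as inputfile:
                newdoc = Document(inputfile.read(), process=False, did=os.path.basename(openfile),
                                  title="titulo_" + os.path.basename(openfile))
            newdoc.process_document(corenlpserver, "biomedical")  # process_document calls the tokenizer
            valid = True
            invalid_sids = []
            for s in newdoc.sentences:
                # A start marker switches the document into an invalid region.
                if s.text in start_markers:
                    valid = False
                # The marker sentences themselves are also recorded as invalid.
                if not valid:
                    invalid_sids.append(s.sid)
                # An end marker switches back to valid text for the following sentences.
                if s.text in end_markers:
                    valid = True
                # Bracketed markers and title-cased sentences are treated as titles.
                if (s.text.startswith("[") and s.text.endswith("]")) or s.text.istitle():
                    newdoc.title_sids.append(s.sid)
            newdoc.invalid_sids = invalid_sids
            logging.debug("invalid sentences: {}".format(invalid_sids))
            logging.debug("title sentences: {}".format(newdoc.title_sids))
            self.documents[newdoc.did] = newdoc
            pbar.update(i + 1)
        pbar.finish()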
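
The sentence-marking loop in the example can be tried in isolation. The following is a minimal sketch, not code from the IBEnt project: the helper name mark_sentences, the (sid, text) tuple representation, the sentence IDs, and the section ID "20104" are all assumptions made for illustration. It applies the same rules as the loop above: sentences from a start marker through the matching end marker are collected as invalid, and bracketed or title-cased sentences are collected as titles.

```python
from typing import Iterable, List, Tuple


def mark_sentences(
    sentences: Iterable[Tuple[str, str]],
    invalid_sections: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (invalid_sids, title_sids) for a sequence of (sid, text) pairs.

    Hypothetical helper: it mirrors the loop in load_corpus but works on
    plain tuples instead of Document sentence objects.
    """
    sections = list(invalid_sections)
    start_markers = {'[start section id="{}"]'.format(s) for s in sections}
    end_markers = {'[end section id="{}"]'.format(s) for s in sections}

    invalid_sids: List[str] = []
    title_sids: List[str] = []
    valid = True
    for sid, text in sentences:
        if text in start_markers:
            valid = False
        if not valid:
            # Marker sentences are recorded as invalid too.
            invalid_sids.append(sid)
        if text in end_markers:
            valid = True
        if (text.startswith("[") and text.endswith("]")) or text.istitle():
            title_sids.append(sid)
    return invalid_sids, title_sids


# Made-up sentences; the IDs and the section number are illustrative only.
example = [
    ("d1.s0", "Patient History"),
    ("d1.s1", '[start section id="20104"]'),
    ("d1.s2", "this text is skipped."),
    ("d1.s3", '[end section id="20104"]'),
    ("d1.s4", "the patient was admitted on Monday."),
]
invalid_sids, title_sids = mark_sentences(example, ["20104"])
print(invalid_sids)
print(title_sids)
```

Because the end marker is checked after the sentence is appended, the closing marker itself lands in invalid_sids, which matches the ordering of the checks in the original loop.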