Python split_document Examples

Programming language: Python

Namespace/package name: negbio.pipeline.section_split

Method/function: split_document

Examples at hotexamples.com: 3

Python split_document - 3 examples found. These are the top-rated real-world Python examples of negbio.pipeline.section_split.split_document, extracted from open source projects.
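For orientation, here is a minimal, standalone sketch of what title-based section splitting can look like. It is not NegBio's implementation: the names `TITLE_PATTERN`, `Section`, and `split_sections` are hypothetical, and the title list is an assumption chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical title pattern for illustration only; it is not NegBio's default.
TITLE_PATTERN = re.compile(
    r"^(FINDINGS|IMPRESSION|INDICATION|COMPARISON):",
    re.IGNORECASE | re.MULTILINE,
)


@dataclass
class Section:
    title: Optional[str]  # None for text that precedes the first title
    offset: int           # character offset of `text` in the original report
    text: str


def _add_section(sections, title, report, start, end):
    """Append report[start:end] as a section, trimmed of surrounding whitespace."""
    chunk = report[start:end]
    trimmed = chunk.strip()
    if trimmed:  # skip empty sections
        leading = len(chunk) - len(chunk.lstrip())
        sections.append(Section(title, start + leading, trimmed))


def split_sections(report: str, pattern=TITLE_PATTERN) -> List[Section]:
    """Split a report into sections at every title matched by `pattern`."""
    sections: List[Section] = []
    title, start = None, 0
    for match in pattern.finditer(report):
        _add_section(sections, title, report, start, match.start())
        title, start = match.group(1).upper(), match.end()
    _add_section(sections, title, report, start, len(report))
    return sections
```

Keeping the character offset alongside each section means downstream annotations can still be mapped back to positions in the original report text.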

Example #1

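from negbio.pipeline import section_split
# normalize_mimiccxr, ssplit, dner_mm, parse, ptb2ud, negdetect, and cleanup
# are imported elsewhere in the source file (not shown in this excerpt).

def process_collection(collection, metamap, splitter, parser, ptb2dep, lemmatizer, neg_detector, cuis, sec_title_patterns):
    # Per-document preprocessing: normalize, split into sections, split into sentences.
    for document in collection.documents:
        normalize_mimiccxr.normalize(document)
        section_split.split_document(document, sec_title_patterns)
        ssplit.ssplit(document, splitter)

    # MetaMap runs once over the whole collection rather than per document.
    dner_mm.run_metamap_col(collection, metamap, cuis)

    # Per-document parsing, dependency conversion, negation detection, cleanup.
    for document in collection.documents:
        document = parse.parse(document, parser)
        document = ptb2ud.convert(document, ptb2dep, lemmatizer)
        document = negdetect.detect(document, neg_detector)
        cleanup.clean_sentences(document)

    return collection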
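Example #1 passes a `sec_title_patterns` argument to `split_document`. The excerpt does not show how that value is built, so the following is only a sketch of one way to compile a list of section titles into a single regular expression; the helper name `compile_title_pattern` and the trailing-colon convention are assumptions, not part of NegBio.

```python
import re
from typing import Iterable, Pattern


def compile_title_pattern(titles: Iterable[str]) -> Pattern[str]:
    """Build one case-insensitive regex matching any title followed by a colon.

    Hypothetical helper: titles are escaped so that characters such as '/'
    or '.' are matched literally, and longer titles are tried first.
    """
    ordered = sorted(set(titles), key=len, reverse=True)
    alternation = "|".join(re.escape(title) for title in ordered)
    return re.compile(r"^(" + alternation + r"):", re.IGNORECASE | re.MULTILINE)
```

For instance, `compile_title_pattern(["FINDINGS", "IMPRESSION"])` yields a pattern whose first capture group holds the matched title, which is convenient for labelling each resulting section.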

Example #2

File: load.py Project: jjalfaro9/chexpert-labeler

    def load(self):
        """Load and clean the reports."""
        collection = bioc.BioCCollection()
        # Read a headerless CSV; the report text lands in the column named REPORTS.
        reports = pd.read_csv(self.reports_path,
                              header=None,
                              names=[REPORTS])[REPORTS].tolist()

        for i, report in enumerate(reports):
            clean_report = self.clean(report)
            document = text2bioc.text2document(str(i), clean_report)

            if self.extract_impression:
                # Split into sections, then extract the Impression section.
                document = section_split.split_document(document)
                self.extract_impression_from_passages(document)

            # Apply the splitter to the (possibly reduced) document.
            split_document = self.splitter.split_doc(document)

            assert len(split_document.passages) == 1, (
                'Each document must have a single passage, '
                'the Impression section.')

            collection.add_document(split_document)

        self.reports = reports
        self.collection = collection

Example #3

    def prep_collection(self):
        """Apply splitter and create bioc collection"""
        collection = bioc.BioCCollection()
        for i, report in enumerate(self.reports):
            clean_report = self.clean(report)
            document = text2bioc.text2document(str(i), clean_report)

            if self.extract_impression:
                # Split into sections, then extract the Impression section.
                document = section_split.split_document(document)
                self.extract_impression_from_passages(document)

            # Apply the splitter to the (possibly reduced) document.
            split_document = self.splitter.split_doc(document)

            assert len(split_document.passages) == 1, (
                'Each document must have a single passage, '
                'the Impression section.')

            collection.add_document(split_document)
        self.collection = collection
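Examples #2 and #3 both call `extract_impression_from_passages` after splitting, but its body is not shown. The sketch below illustrates the general idea of keeping exactly one Impression section once a report has been split. It is an assumption-laden stand-in, not the chexpert-labeler method: sections are modelled as plain `(title, text)` tuples and `select_impression` is a hypothetical name.

```python
from typing import List, Optional, Tuple

# Hypothetical section shape: (title, text), with title None for untitled text.
Section = Tuple[Optional[str], str]


def select_impression(sections: List[Section]) -> str:
    """Return the text of the single Impression section.

    Raises ValueError when the report has no Impression section or more
    than one, so the caller never silently labels the wrong text.
    """
    matches = [
        text
        for title, text in sections
        if title is not None and title.strip().lower() == "impression"
    ]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one Impression section, found {len(matches)}"
        )
    return matches[0]
```

The sketch raises an exception instead of using `assert` because assertions are skipped when Python runs with the `-O` flag, which would let malformed reports pass through unnoticed.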