Python CasUtil Examples

Programming language: Python

Namespace/package name: pipeline

Class/type: CasUtil

Examples on hotexamples.com: 6

Python CasUtil - 6 examples found. These are the top-rated real-world Python examples of pipeline.CasUtil, extracted from open source projects. You can rate examples to help improve their quality.

Frequently used methods

get_annotations(4)

get_all_annotations(3)

has_annotation(1)
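
The three methods above can be illustrated with a small, self-contained sketch. Everything in it is an assumption inferred from the call sites on this page: the `Cas` and `Annotation` classes are hypothetical stand-ins (the real `Annotation` takes `cas.get_view()` as its first argument), and this `CasUtil` is a guess at the behaviour, not the BTWitter implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Annotation:
    """Hypothetical stand-in: a typed span over the document text."""
    text: str                     # full document text the span points into
    begin: int
    end: int
    type: str
    value: Optional[str] = None

    def get_covered_text(self) -> str:
        return self.text[self.begin:self.end]


@dataclass
class Cas:
    """Hypothetical stand-in for the CAS object passed to process()."""
    artifact: str
    annotations: List[Annotation] = field(default_factory=list)


class CasUtil:
    """Assumed behaviour, inferred only from how the examples call it."""

    @staticmethod
    def get_all_annotations(cas: Cas) -> Iterator[Annotation]:
        return iter(cas.annotations)

    @staticmethod
    def get_annotations(cas: Cas, annot_type: str) -> Iterator[Annotation]:
        # Lazily yield only the annotations of the requested type.
        return (a for a in cas.annotations if a.type == annot_type)

    @staticmethod
    def has_annotation(cas: Cas, covered_text: str, annot_type: str) -> bool:
        # True if some annotation of that type covers exactly this text.
        return any(a.get_covered_text() == covered_text
                   for a in CasUtil.get_annotations(cas, annot_type))


text = "Angela Merkel spricht"
cas = Cas(text, [
    Annotation(text, 0, 6, "Token"),
    Annotation(text, 7, 13, "Token"),
    Annotation(text, 14, 21, "Token"),
    Annotation(text, 0, 13, "NER", "PER"),
    Annotation(text, 0, 21, "Language", "de"),
])

tokens = [a.get_covered_text() for a in CasUtil.get_annotations(cas, "Token")]
lang = next(CasUtil.get_annotations(cas, "Language"), None)
```

Returning an iterator from `get_annotations` matches Example #1, which calls `next()` on the result directly.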

Example #1

File: casConsumer.py Project: Tooa/BTWitter

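    def process(self, cas):
        # Only German documents are processed. Skip the CAS if it has no
        # Language annotation (next() would otherwise raise StopIteration).
        lang = next(CasUtil.get_annotations(cas, "Language"), None)
        if lang is None or lang.value != "de":
            return

        # Collect the tokens that pass this consumer's filter.
        filtered_token = []
        for annot in CasUtil.get_all_annotations(cas):
            self.add_to_filtered_token(annot, filtered_token)

        # Flag each lowercased token: 1 if it has an NER annotation, else 0.
        for t in filtered_token:
            self.unique_token[t.lower()] = 1 if CasUtil.has_annotation(cas, t, 'NER') else 0

        self.write_output_files(cas, filtered_token)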

Example #2

File: normalizer.py Project: Tooa/BTWitter

    def process(self, cas):
        for token_annot in CasUtil.get_annotations(cas, "Token"):
            token = token_annot.get_covered_text()
            normalized = self.normalize_word_token(token)

            # Record the normalized form as an "Error" annotation over the
            # same span, but only when normalization changed the token.
            if normalized != token:
                norm_annot = Annotation(cas.get_view(), token_annot.begin, token_annot.end, "Error", normalized)
                cas.add_fs_annotation(norm_annot)

Example #3

File: casConsumer.py Project: Tooa/BTWitter

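    def process(self, cas):
        # Local import keeps the snippet self-contained.
        from xml.sax.saxutils import escape, quoteattr

        # Attribute values are quoted and text is escaped so that the
        # output is well-formed XML.
        self.f.write('<document id=' + quoteattr(str(cas.document_id)) + '>\n')
        self.f.write('\t<text>' + escape(cas.artifact) + '</text>\n')
        self.f.write('\t<annotations>\n')
        for annot in CasUtil.get_all_annotations(cas):
            xml = '\t\t<annotation'
            xml += ' begin=' + quoteattr(str(annot.begin))
            xml += ' end=' + quoteattr(str(annot.end))
            # type and value are optional and omitted when empty.
            xml += (' type=' + quoteattr(annot.type)) if annot.type else ''
            xml += (' value=' + quoteattr(str(annot.value))) if annot.value else ''
            xml += ' />\n'
            self.f.write(xml)
        self.f.write('\t</annotations>\n')
        self.f.write('</document>\n\n')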
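
As an alternative to concatenating strings by hand, the same document structure could be built with the standard library's `xml.etree.ElementTree`, which escapes text and attribute values itself. This is only a sketch: `cas_to_xml` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that carry just the attributes the example above reads (`document_id`, `artifact`, `begin`, `end`, `type`, `value`).

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace


def cas_to_xml(cas, annotations):
    """Serialize a document and its annotations to an XML string."""
    doc = ET.Element("document", {"id": str(cas.document_id)})
    ET.SubElement(doc, "text").text = cas.artifact
    annots_el = ET.SubElement(doc, "annotations")
    for annot in annotations:
        attrs = {"begin": str(annot.begin), "end": str(annot.end)}
        # type and value are optional and omitted when empty.
        if annot.type:
            attrs["type"] = annot.type
        if annot.value:
            attrs["value"] = str(annot.value)
        ET.SubElement(annots_el, "annotation", attrs)
    return ET.tostring(doc, encoding="unicode")


# Hypothetical stand-ins for a CAS and one of its annotations.
cas = SimpleNamespace(document_id=7, artifact="Tom & Jerry <3")
annotations = [SimpleNamespace(begin=0, end=3, type="Token", value=None)]
xml_text = cas_to_xml(cas, annotations)
```

With the real classes, the second argument would presumably be `CasUtil.get_all_annotations(cas)`. Unlike the hand-written version, this output is not indented with tabs.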

Example #4

File: tokenTagger.py Project: Tooa/BTWitter

    def process(self, cas):
        for token_annot in CasUtil.get_annotations(cas, "Token"):
            token = token_annot.get_covered_text()
            if self.is_token_to_tag(token):
                annot = Annotation(cas.get_view(), token_annot.begin, token_annot.end, self.get_token_type())
                cas.add_fs_annotation(annot)

Example #5

File: casConsumer.py Project: Tooa/BTWitter

    def write_output_files(self, cas, filtered_token):
        self.sent_writer.writerow([cas.document_id, cas.date, cas.artifact])
        self.token_writer.writerow([cas.document_id, cas.date, " ".join(filtered_token)])

        raw_token = [annot.get_covered_text() for annot in CasUtil.get_annotations(cas, "Token")]
        self.raw_token_writer.writerow([" ".join(raw_token)])

Example #6

File: casConsumer.py Project: Tooa/BTWitter

    def process(self, cas):
        print("Artifact:", cas.artifact)
        for annot in CasUtil.get_all_annotations(cas):
            print(annot, annot.get_covered_text())