Python AnnotationAllDocs.read_spans examples

Programming language: Python

Namespace/package name: sparv

Class/type: AnnotationAllDocs

Method/function: read_spans

Examples on hotexamples.com: 2

Python AnnotationAllDocs.read_spans: 2 examples found. These are the top-rated real-world Python examples of sparv.AnnotationAllDocs.read_spans, extracted from open source projects.

Frequently used methods

AnnotationAllDocs(8)

read_attributes(3)

read_spans(2)

read(1)

Example #1

def timespan_sql_no_dateinfo(
        corpus: Corpus = Corpus(),
        out: Export = Export("korp_timespan/timespan.sql"),
        docs: AllDocuments = AllDocuments(),
        token: AnnotationAllDocs = AnnotationAllDocs("<token>")):
    """Create timespan SQL data for use in Korp."""
    corpus_name = corpus.upper()
    token_count = 0

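    # Count the token spans in every document of the corpus
    for doc in docs:
        tokens = token.read_spans(doc)
        token_count += len(list(tokens))

    # No date information is available, so use all-zero placeholders:
    # 8 characters for the date rows, 14 characters for the datetime rows
    rows_date = [{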
        "corpus": corpus_name,
        "datefrom": "0" * 8,
        "dateto": "0" * 8,
        "tokens": token_count
    }]
    rows_datetime = [{
        "corpus": corpus_name,
        "datefrom": "0" * 14,
        "dateto": "0" * 14,
        "tokens": token_count
    }]

    create_sql(corpus_name, out, rows_date, rows_datetime)
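
The counting loop in Example #1 can be tried without a Sparv installation. The following is a minimal sketch, assuming `read_spans(doc)` yields one item per span, as the `len(list(...))` call in the example suggests. `FakeTokenAnnotation`, `count_spans`, `placeholder_rows`, and the in-memory span data are hypothetical stand-ins and not part of sparv. The sketch counts with `sum(1 for ...)`, which gives the same total as `len(list(...))` without building an intermediate list.

```python
from typing import Dict, Iterator, List, Tuple

Span = Tuple[int, int]


class FakeTokenAnnotation:
    """Hypothetical stand-in for AnnotationAllDocs; not part of sparv."""

    def __init__(self, spans_by_doc: Dict[str, List[Span]]):
        self._spans_by_doc = spans_by_doc

    def read_spans(self, doc: str) -> Iterator[Span]:
        # Assumption: one (start, end) pair is yielded per annotated span.
        yield from self._spans_by_doc[doc]


def count_spans(annotation: FakeTokenAnnotation, docs: List[str]) -> int:
    """Count spans across all documents without materializing lists."""
    return sum(1 for doc in docs for _ in annotation.read_spans(doc))


def placeholder_rows(corpus_name: str, token_count: int, width: int) -> List[dict]:
    """Build a single row whose dates are all-zero strings of the given width."""
    zeros = "0" * width
    return [{
        "corpus": corpus_name,
        "datefrom": zeros,
        "dateto": zeros,
        "tokens": token_count,
    }]


# Made-up data: two documents with three token spans in total.
token = FakeTokenAnnotation({"doc1": [(0, 5), (6, 11)], "doc2": [(0, 3)]})
total = count_spans(token, ["doc1", "doc2"])
rows_date = placeholder_rows("MYCORPUS", total, 8)
rows_datetime = placeholder_rows("MYCORPUS", total, 14)
print(total, rows_date[0]["datefrom"], rows_datetime[0]["datefrom"])
```

Only the total is needed here, so the spans themselves are never stored; the two row lists differ only in the width of the zero placeholder.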

Example #2

File: info.py Project: heatherleaf/sparv-pipeline

def info_sentences(
        out: OutputCommonData = OutputCommonData("cwb.sentencecount"),
        sentence: AnnotationAllDocs = AnnotationAllDocs("<sentence>"),
        docs: AllDocuments = AllDocuments()):
    """Determine how many sentences there are in the corpus."""
    # Read sentence annotation and count the sentences
    sentence_count = 0
    for doc in docs:
        try:
            sentence_count += len(list(sentence.read_spans(doc)))
        except FileNotFoundError:
            # Skip documents for which the sentence annotation cannot be read
            pass

    if sentence_count == 0:
        log.info("No sentence information found in corpus")

    # Write sentencecount data
    out.write(str(sentence_count))
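
The `try`/`except FileNotFoundError` pattern in Example #2 lets documents without the annotation contribute nothing to the total. Below is a minimal sketch of that behavior, assuming a missing annotation surfaces as `FileNotFoundError`, as the example implies. `FakeSentenceAnnotation` and `count_sentences` are hypothetical names used for illustration and are not part of sparv. Unlike the original, the sketch also records which documents were skipped, which can make a zero count easier to diagnose.

```python
from typing import Dict, Iterator, List, Tuple

Span = Tuple[int, int]


class FakeSentenceAnnotation:
    """Hypothetical stand-in for AnnotationAllDocs; not part of sparv."""

    def __init__(self, spans_by_doc: Dict[str, List[Span]]):
        self._spans_by_doc = spans_by_doc

    def read_spans(self, doc: str) -> Iterator[Span]:
        # Assumption: a document without this annotation raises FileNotFoundError.
        if doc not in self._spans_by_doc:
            raise FileNotFoundError(f"no sentence annotation for {doc}")
        return iter(self._spans_by_doc[doc])


def count_sentences(
    annotation: FakeSentenceAnnotation, docs: List[str]
) -> Tuple[int, List[str]]:
    """Return the total span count and the documents that had to be skipped."""
    count = 0
    skipped: List[str] = []
    for doc in docs:
        try:
            count += len(list(annotation.read_spans(doc)))
        except FileNotFoundError:
            skipped.append(doc)
    return count, skipped


# Made-up data: "b" has no sentence annotation, "c" has an empty one.
sentence = FakeSentenceAnnotation({"a": [(0, 10), (11, 20)], "c": []})
sentence_count, skipped_docs = count_sentences(sentence, ["a", "b", "c"])
print(sentence_count, skipped_docs)
```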