Python TextBoundAnnotation 예제들

프로그래밍 언어: Python

네임스페이스/패키지 이름: annotation

클래스/타입: TextBoundAnnotation

hotexamples.com에서의 예제들: 3

Python TextBoundAnnotation - 3개의 예제가 발견되었습니다. 이것들은 오픈소스 프로젝트에서 추출된 Python의 annotation.TextBoundAnnotation에 대한 실세계 최고 등급의 예제들입니다. 예제들을 평가하여 예제의 품질 향상에 도움을 줄 수 있습니다.

자주 사용되는 메소드들

보기 숨기기

TextBoundAnnotation(3)

자주 사용되는 메소드들

TextBoundAnnotation (3)

예제 #1

파일 보기

파일: stanford.py 프로젝트: shalevy1/ActiveLearningAnnotationTool

def ner(xml, start_id=1):
    soup = _soup(xml)
    token_by_ids = _token_by_ids(soup)

    # Stanford only has Inside and Outside tags, so conversion is easy
    nes = []
    last_ne_tok = None
    prev_tok = None
    for _, _, tok in _tok_it(token_by_ids):
        if tok.ner != 'O':
            if last_ne_tok is None:
                # Start of an NE from nothing
                last_ne_tok = tok
            elif tok.ner != last_ne_tok.ner:
                # Change in NE type
                nes.append((last_ne_tok.start, prev_tok.end, last_ne_tok.ner, ))
                last_ne_tok = tok
            else:
                # Continuation of the last NE, move along
                pass
        elif last_ne_tok is not None:
            # NE ended
            nes.append((last_ne_tok.start, prev_tok.end, last_ne_tok.ner, ))
            last_ne_tok = None
        prev_tok = tok
    else:
        # Do we need to terminate the last named entity?
        if last_ne_tok is not None:
            nes.append((last_ne_tok.start, prev_tok.end, last_ne_tok.ner, ))

    curr_id = start_id
    for start, end, _type in nes:
        yield TextBoundAnnotation(((start, end), ), 'T%s' % curr_id, _type, '')
        curr_id += 1

예제 #2

파일 보기

파일: stanford.py 프로젝트: shalevy1/ActiveLearningAnnotationTool

def _pos(xml, start_id=1):
    soup = _soup(xml)
    token_by_ids = _token_by_ids(soup)

    curr_id = start_id
    for s_id, t_id, tok in _tok_it(token_by_ids):
        yield s_id, t_id, TextBoundAnnotation(((tok.start, tok.end, ), ),
                'T%s' % curr_id, tok.pos, '')
        curr_id += 1

예제 #3

파일 보기

파일: stanford.py 프로젝트: shalevy1/ActiveLearningAnnotationTool

def coref(xml, start_id=1):
    soup = _soup(xml)
    token_by_ids = _token_by_ids(soup)
    
    docs_e = soup.findall('document')
    assert len(docs_e) == 1
    docs_e = docs_e[0]
    # Despite the name, this element contains conferences (note the "s")
    corefs_e = docs_e.findall('coreference')
    if not corefs_e:
        # No coreferences to process
        raise StopIteration
    assert len(corefs_e) == 1
    corefs_e = corefs_e[0]

    curr_id = start_id
    for coref_e in corefs_e:
        if corefs_e.tag != 'coreference':
            # To be on the safe side
            continue

        # This tag is now a full corference chain
        chain = []
        for mention_e in coref_e.getiterator('mention'):
            # Note: There is a "representative" attribute signalling the most
            #   "suitable" mention, we are currently not using this
            # Note: We don't use the head information for each mention
            sentence_id = int(mention_e.find('sentence').text)
            start_tok_id = int(mention_e.find('start').text)
            end_tok_id = int(mention_e.find('end').text) - 1

            mention_id = 'T%s' % (curr_id, )
            chain.append(mention_id)
            curr_id += 1
            yield TextBoundAnnotation(
                    ((token_by_ids[sentence_id][start_tok_id].start,
                    token_by_ids[sentence_id][end_tok_id].end), ),
                    mention_id, 'Mention', '')

        yield EquivAnnotation('Coreference', chain, '')