def post_order(s):
    """Recursively build graph arrays from the S-expression subtree *s*,
    returning the index assigned to this node.

    NOTE(review): this is a closure — it mutates enclosing-scope state:
    ``index`` (next node id), ``leaf_cnt`` (leaf counter), and the lists
    ``labels`` (per-node label), ``prts`` (per-node list of parent
    indices) and ``childs`` (per-node child indices incl. a self-loop).
    Confirm these bindings against the enclosing function.
    """
    nonlocal index
    nonlocal leaf_cnt
    # Split "(LABEL rest)" into the node label and the remaining phrase.
    label, phrase = s[1:-1].split(None, 1)
    leafs = sexpr.sexpr_tokenize(phrase)
    if len(leafs) == 2:
        # Binary internal node: recurse on both children first
        # (post-order), then record this node as each child's parent.
        # ``index`` still holds this node's id here since it is only
        # advanced after both recursive calls return.
        lstr, rstr = leafs
        lrst = post_order(lstr)
        rrst = post_order(rstr)
        prts[lrst].append(index)
        prts[rrst].append(index)
    else:
        # Leaf node: just count it; its label is appended below.
        leaf_cnt += 1
    labels.append(label)
    prts.append([])  # parents of this node get appended by its parent later
    # childs.append([lrst, rrst] if len(leafs)==2 else [])
    cur = index
    childs_list = [lrst, rrst] if len(leafs) == 2 else []
    childs_list.append(cur)  # self-loop
    childs.append(childs_list)
    # print(index)
    index += 1
    return cur
def parse_subtree(self, s):
    """Split one S-expression into (root label, tokenized children).

    The outer parentheses of *s* are dropped. If the interior has no
    space to split on, the root label is taken to be empty and the whole
    interior is treated as the children. Punctuation and spaces are
    stripped from the root label.
    """
    inner = s[1:-1].lstrip()
    pieces = inner.split(' ', 1)
    if len(pieces) == 2:
        root, rest = pieces
    else:
        root, rest = '', inner
    return (root.strip(punctuation + ' '), sexpr_tokenize(rest.strip()))
def parse(root_sexpr):
    """Flatten an S-expression tree to ``(root label, joined leaf tokens)``.

    Performs an iterative pre-order walk with an explicit stack; leaf
    words are collected left-to-right and joined with spaces.

    NOTE(review): ``Stack`` and ``sexpr`` come from the enclosing module.
    Assumes ``stack.empty`` is an attribute/property that is truthy when
    the stack is empty — confirm against the Stack implementation.
    """
    label, sub_sexpr = root_sexpr[1:-1].split(None, 1)
    tokens = []
    stack = Stack()
    # Push top-level children in reverse so they are popped left-to-right.
    for child in reversed(sexpr.sexpr_tokenize(sub_sexpr)):
        stack.push(child)
    while not stack.empty:
        _, next_sexpr = stack.pop()[1:-1].split(None, 1)
        next_sexprs = sexpr.sexpr_tokenize(next_sexpr)
        # Leaf: exactly one token and that token is not itself bracketed.
        # BUG FIX: the second membership test previously checked the LIST
        # (`')' not in next_sexprs`) instead of the token string, so a
        # token containing ')' could slip through as a "leaf".
        if len(next_sexprs) == 1 and ('(' not in next_sexprs[0] and ')' not in next_sexprs[0]):
            tokens.append(next_sexprs[0])
        else:
            # Internal node: push children in reverse order as above.
            for child in reversed(next_sexprs):
                stack.push(child)
    return label, ' '.join(tokens)
def post_order(s):
    """Post-order walk of a binary S-expression tree.

    Leaves push their word into ``words`` and their tag into ``labels``
    (both presumably lists from the enclosing scope — confirm against the
    surrounding code); the label of the visited node is returned.
    """
    label, rest = s[1:-1].split(None, 1)
    parts = sexpr.sexpr_tokenize(rest)
    if len(parts) == 2:
        # Internal binary node: visit left subtree, then right.
        left, right = parts
        post_order(left)
        post_order(right)
    else:
        # Leaf: record the word together with its label.
        words.append(parts[0])
        labels.append(label)
    return label
def tokenize(x):
    """Tokenizes S-expression dependency parse trees that come with NLI data.

    This one has been tested here:
    https://github.com/timniven/hsnli/blob/master/hsnli/tests/tree_sexpr_tests.py

    Args:
      x: String, the tree (or subtree) S-expression.

    Returns:
      String, List(String), Boolean: tag, [S-expression for the node],
        is_leaf flag indicating whether this node is a leaf.
    """
    inner = x[1:-1]
    if '(' in inner:
        # Internal node: first token is the tag, the rest are the
        # child S-expressions.
        pieces = sexpr.sexpr_tokenize(inner)
        tag, data = pieces[0], pieces[1:]
    else:
        # Leaf of the form "tag word"; any tokens after the word are
        # discarded, matching the original behavior.
        fields = inner.split(' ')
        tag, data = fields[0], [fields[1]]
    is_leaf = len(data) == 1 and not (data[0][0] == '(' and data[0][-1] == ')')
    return tag, data, is_leaf
def tokenize(s):
    """Split "(label phrase)" into its label and tokenized phrase.

    The leading ``str.split`` handles the leaf case ('label word'):
    sexpr_tokenize alone can only parse '(foo) (bar)'-style input.
    """
    inner = s[1:-1]
    label, phrase = inner.split(None, 1)
    return label, sexpr.sexpr_tokenize(phrase)
def tokenize(s):
    """Parse "(label/context/pos1/pos2 phrase)" for classification.

    Returns the label plus a tuple of (tokenized phrase, outer context,
    entity-1 position, entity-2 position).
    """
    header, phrase = s[1:-1].split(None, 1)
    # The header packs four '/'-separated fields: the classification
    # label, the outer context, and the two entity positions.
    label, outer_context, ent1_posit, ent2_posit = header.split("/")
    return label, (sexpr.sexpr_tokenize(phrase), outer_context, ent1_posit, ent2_posit)
def tokenize(s):
    """Return (label, child S-expressions) for the bracketed tree *s*."""
    inner = s[1:-1]
    label, remainder = inner.split(None, 1)
    return label, sexpr.sexpr_tokenize(remainder)
def tokenize(self, s):
    """Tokenize *s* with its outer brackets removed.

    A whitespace-only interior yields [''] rather than an empty list.
    """
    inner = s[1:-1].strip()
    return sexpr_tokenize(inner) if inner else ['']