Python SimpleCorpus.SimpleCorpus Examples

Programming language: Python

Namespace/package name: marmot.util.simple_corpus

Class/type: SimpleCorpus

Method/function: SimpleCorpus

Examples at hotexamples.com: 4

Python SimpleCorpus.SimpleCorpus - 4 examples found. These are the top-rated real-world Python examples of marmot.util.simple_corpus.SimpleCorpus.SimpleCorpus, extracted from open source projects.

Frequently used methods

SimpleCorpus(4)

get_texts(4)

Example #1

File: parsers.py Project: tien-le-grenoble/marmot

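from collections import defaultdict

from marmot.util.simple_corpus import SimpleCorpus


def extract_important_tokens(corpus_file, min_count=1):
    """Return the set of tokens that occur at least min_count times."""
    corpus = SimpleCorpus(corpus_file)
    word_counts = defaultdict(int)
    for context in corpus.get_texts():
        for word in context:
            word_counts[word] += 1
    return {k for k, v in word_counts.items() if v >= min_count}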
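
To try the counting logic without installing marmot, the sketch below swaps SimpleCorpus for a hypothetical stand-in named StandInCorpus. It assumes the corpus file holds one text per line and that tokens are separated by whitespace; the real class may read or normalize text differently. The demo helper and the sample sentences are illustrative only.

```python
import os
import tempfile
from collections import Counter


class StandInCorpus:
    """Hypothetical stand-in for SimpleCorpus (assumption: one text per line)."""

    def __init__(self, path):
        self.path = path

    def get_texts(self):
        # Assumption: each non-empty line becomes a list of whitespace-separated tokens.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()
                if tokens:
                    yield tokens


def extract_important_tokens(corpus_file, min_count=1):
    # Same logic as the excerpt above, with Counter doing the tallying.
    corpus = StandInCorpus(corpus_file)
    counts = Counter()
    for context in corpus.get_texts():
        counts.update(context)
    return {word for word, count in counts.items() if count >= min_count}


def demo():
    # Write a tiny two-line corpus to a temporary file, count, then clean up.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("the cat sat\nthe dog ran\n")
        path = tmp.name
    try:
        return extract_important_tokens(path, min_count=2)
    finally:
        os.remove(path)
```

With min_count=2, only tokens that appear in both sample lines survive the filter.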

Example #2

File: parsers.py Project: tien-le-grenoble/marmot

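from marmot.util.simple_corpus import SimpleCorpus

def get_corpus_file(corpus_file, label):
    # Pair the label with the texts read from the corpus file.
    corpus = SimpleCorpus(corpus_file)
    return (label, corpus.get_texts())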
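
The excerpt does not say whether get_texts() returns a list or a lazy iterator. If it is a generator (an assumption, not confirmed by the source), the second element of the returned tuple can be consumed only once. The sketch below uses a hypothetical generator, texts_as_generator, and a hypothetical helper, label_texts, to show one way to materialize the texts so they can be iterated repeatedly.

```python
def texts_as_generator(lines):
    # Hypothetical stand-in for a generator-based get_texts().
    for line in lines:
        yield line.split()


def label_texts(label, texts):
    # Materialize the iterable so the pair can be traversed more than once.
    return (label, list(texts))


label, texts = label_texts("en", texts_as_generator(["a b", "c"]))
first_pass = [tokens for tokens in texts]
second_pass = [tokens for tokens in texts]
```

Materializing trades memory for convenience, so for a large corpus it may be preferable to call get_texts() again instead.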

Example #3

File: parsers.py Project: tien-le-grenoble/marmot

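from marmot.util.simple_corpus import SimpleCorpus

# list_of_target_contexts is called as in the original file; its definition is not part of this excerpt.
def parse_corpus_contexts(corpus_file, interesting_tokens=None, tag=1):
    corpus = SimpleCorpus(corpus_file)
    return list_of_target_contexts(corpus.get_texts(),
                                   interesting_tokens,
                                   tag=tag)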

Example #4

File: test_parsers.py Project: tien-le-grenoble/marmot

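import os

from marmot.util.simple_corpus import SimpleCorpus

# Excerpt: setUp is a method of a test case class that is not shown here.
def setUp(self):
    self.interesting_tokens = {'the', 'it'}
    module_path = os.path.dirname(__file__)
    self.corpus_path = os.path.join(module_path, 'test_data/corpus.en.1000')
    self.corpus = SimpleCorpus(self.corpus_path)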
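
The fixture above depends on a data file shipped with the project. As a minimal sketch of the same setUp pattern that runs anywhere, the test case below writes a small corpus into a temporary directory and removes it afterwards. The class name CorpusFixtureTest, the helper read_texts, the file name corpus.en.sample, and the sample sentences are all hypothetical; read_texts stands in for SimpleCorpus(path).get_texts() under the assumption of one whitespace-tokenized text per line.

```python
import os
import shutil
import tempfile
import unittest


def read_texts(path):
    # Hypothetical helper standing in for SimpleCorpus(path).get_texts().
    with open(path, encoding="utf-8") as handle:
        return [line.split() for line in handle if line.strip()]


class CorpusFixtureTest(unittest.TestCase):
    def setUp(self):
        self.interesting_tokens = {"the", "it"}
        # Create an isolated directory holding a two-line sample corpus.
        self.tmp_dir = tempfile.mkdtemp()
        self.corpus_path = os.path.join(self.tmp_dir, "corpus.en.sample")
        with open(self.corpus_path, "w", encoding="utf-8") as handle:
            handle.write("the cat sat\nit ran\n")

    def tearDown(self):
        # Remove the temporary directory and everything in it.
        shutil.rmtree(self.tmp_dir)

    def test_interesting_tokens_present(self):
        seen = {tok for text in read_texts(self.corpus_path) for tok in text}
        self.assertTrue(self.interesting_tokens <= seen)


def run_fixture_test():
    # Run the test case programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CorpusFixtureTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Creating the corpus inside setUp keeps the test independent of the repository layout, at the cost of no longer exercising the project's real test data.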