Python SnapshotCorpus.SnapshotCorpus Examples

Programming language: Python

Namespace/package name: src.corpora

Class/type: SnapshotCorpus

Method/function: SnapshotCorpus

Examples at hotexamples.com: 4

Python SnapshotCorpus.SnapshotCorpus - 4 examples found. These are the top-rated real-world Python examples of src.corpora.SnapshotCorpus.SnapshotCorpus, extracted from open source projects.

Frequently used methods

SnapshotCorpus(4)

get_texts(2)

Example #1

File: test_corpora.py Project: kidaak/doc2vec-feature-location

    def setUp(self):
        super(TestCorpusCombiner, self).setUp()
        # Snapshot at the first ref: 3 documents
        p1 = self.Project(ref=u'2aeb2e7c78259833e1218b69f99dab3acd00970c',
                          level='file',
                          src_path=self.basepath)
        self.corpus1 = SnapshotCorpus(repo=self.repo,
                                      project=p1,
                                      remove_stops=False,
                                      lower=True,
                                      split=True,
                                      min_len=0)
        self.docs1 = list(self.corpus1)

        # Snapshot at the second ref: the 3 earlier documents plus 2 new ones
        p2 = self.Project(ref=u'3587d37e7d476ddc7b673c41762dc89c8ca63a6a',
                          level='file',
                          src_path=self.basepath)
        self.corpus2 = SnapshotCorpus(repo=self.repo,
                                      project=p2,
                                      remove_stops=False,
                                      lower=True,
                                      split=True,
                                      min_len=0)
        self.docs2 = list(self.corpus2)

        # Combine both snapshots into a single corpus
        self.corpus = CorpusCombiner([self.corpus1, self.corpus2])
        self.docs = list(self.corpus)
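
The excerpt above does not show how CorpusCombiner merges its inputs. As a rough illustration only, the sketch below assumes plain concatenation, so documents present in both snapshots would appear twice. `ChainedCorpus` and the sample token lists are hypothetical and are not part of the doc2vec-feature-location project.

```python
from itertools import chain


class ChainedCorpus:
    """Hypothetical stand-in: iterates several corpora one after another."""

    def __init__(self, corpora):
        # Keep references to the underlying corpora; nothing is copied.
        self.corpora = list(corpora)

    def __iter__(self):
        # A fresh chain per call, so the combined corpus can be iterated repeatedly.
        return chain.from_iterable(self.corpora)


# Toy data mirroring the comments in the test: 3 documents, then 3 + 2 documents.
older = [["open", "file"], ["read", "line"], ["close", "file"]]
newer = older + [["parse", "args"], ["write", "log"]]

combined = ChainedCorpus([older, newer])
docs = list(combined)
```

If the real combiner deduplicates or re-keys documents, the resulting document count would differ from this sketch.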

Example #2

File: test_corpora.py Project: kidaak/doc2vec-feature-location

    def setUp(self):
        super(TestSnapshotCorpus, self).setUp()
        self.corpus = SnapshotCorpus(repo=self.repo,
                                     remove_stops=False,
                                     lower=True,
                                     split=True,
                                     min_len=0)
        self.docs = list(self.corpus)

Example #3

File: test_corpora.py Project: kidaak/doc2vec-feature-location

    def setUp(self):
        super(TestSnapshotCorpusAtRef, self).setUp()
        p1 = self.Project(ref=u'f33a0fb070a34fc1b9105453b3ffb4edc49131d9',
                          level='file',
                          src_path=self.basepath)
        self.corpus = SnapshotCorpus(repo=self.repo,
                                     project=p1,
                                     remove_stops=False,
                                     lower=True,
                                     split=True,
                                     min_len=0)
        self.docs = list(self.corpus)

Example #4

File: test_corpora.py Project: kidaak/doc2vec-feature-location

    def test_lazy(self):
        corpus = SnapshotCorpus(repo=self.repo,
                                remove_stops=False,
                                lower=True,
                                split=True,
                                min_len=0,
                                lazy_dict=True)

        # No iteration has happened yet, so the lazily built dict is still empty
        self.assertEqual(len(corpus.id2word), 0)

        # With lazy_dict=True, iterating over the corpus builds the dict
        docs = list(corpus)

        self.assertGreater(len(corpus.id2word), 0)
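
The test above relies on the dictionary being empty until the corpus is iterated. The following is a minimal, self-contained sketch of that lazy pattern. It is not the project's implementation: `LazyVocabCorpus`, its `word2id` attribute, and the sample texts are hypothetical, and only the `lazy_dict` flag and the `id2word` attribute name are borrowed from the test.

```python
class LazyVocabCorpus:
    """Toy corpus whose id -> word mapping can be filled during iteration."""

    def __init__(self, texts, lazy_dict=False):
        self.texts = texts
        self.word2id = {}
        self.id2word = {}
        if not lazy_dict:
            # Eager mode: one full pass up front to build the vocabulary.
            for tokens in self.texts:
                self._register(tokens)

    def _register(self, tokens):
        # Assign the next free integer id to every token not seen before.
        for token in tokens:
            if token not in self.word2id:
                new_id = len(self.word2id)
                self.word2id[token] = new_id
                self.id2word[new_id] = token

    def __iter__(self):
        for tokens in self.texts:
            # In lazy mode this is where the vocabulary actually grows.
            self._register(tokens)
            counts = {}
            for token in tokens:
                word_id = self.word2id[token]
                counts[word_id] = counts.get(word_id, 0) + 1
            # Yield a bag-of-words: sorted (word_id, count) pairs.
            yield sorted(counts.items())


corpus = LazyVocabCorpus([["open", "file"], ["close", "file"]], lazy_dict=True)
size_before = len(corpus.id2word)  # vocabulary not built yet
docs = list(corpus)                # iterating fills the vocabulary
size_after = len(corpus.id2word)
```

The trade-off in this sketch is that lazy mode avoids an extra pass over the texts at construction time, but the vocabulary is only complete after the first full iteration.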