Python SentenceGenerator Examples

Programming language: Python

Namespace/package name: gensent

Method/function: SentenceGenerator

Examples on hotexamples.com: 8

Python SentenceGenerator - 8 examples found. These are top-rated real-world Python examples of gensent.SentenceGenerator, extracted from open-source projects.

Example #1

File: test_gensent.py Project: kaleidoescape/hlt-2018

 def test_process_sentence_russian(self):
     sentences = gensent.SentenceGenerator(language='russian', lemma=True)
     result = sentences._process_sentence(
         "Три девицы под окном Пряли поздно вечерком.")
     correct = [
         'три', 'девица', 'под', 'окно', 'прясть', 'поздно', 'вечерок'
     ]
     self.assertEqual(result, correct)

Example #2

File: test_gensent.py Project: kaleidoescape/hlt-2018

 def test_dutch(self):
     sentences = gensent.SentenceGenerator(language='dutch', lemma=True)
     result = sentences._process_sentence("Ik ga naar buiten toe")
     correct = ['ik', 'gaan', 'naar', 'buiten', 'toe']
     self.assertEqual(result, correct)
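
The two tests above expect lowercase tokens with punctuation stripped and each word reduced to its dictionary form. The sketch below only imitates that output shape with a hand-written lookup table. It is not how gensent works internally (its source is not shown on this page), and `TOY_LEMMAS` and `process_sentence` are hypothetical names introduced for illustration.

```python
import re

# Hypothetical, hand-written lemma table; a real lemmatizer covers a full lexicon.
TOY_LEMMAS = {
    "ga": "gaan",
    "девицы": "девица",
    "окном": "окно",
    "пряли": "прясть",
    "вечерком": "вечерок",
}


def process_sentence(sentence, lemmas=TOY_LEMMAS):
    """Lowercase, keep runs of word characters, then map each token to its lemma."""
    tokens = re.findall(r"\w+", sentence.lower())
    # Tokens missing from the table are passed through unchanged.
    return [lemmas.get(token, token) for token in tokens]


print(process_sentence("Ik ga naar buiten toe"))
print(process_sentence("Три девицы под окном Пряли поздно вечерком."))
```

A table lookup like this cannot handle unseen word forms, which is presumably why the project relies on a dedicated lemmatizer when `lemma=True`.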

Example #3

File: test_gensent.py Project: kaleidoescape/hlt-2018

 def test_process_numbers(self):
     sentences = gensent.SentenceGenerator()
     result = sentences._process_sentence("Pi is 3.14159")
     correct = ['pi', 'is', sentences.NUM]
     self.assertEqual(result, correct)

Example #4

File: test_gensent.py Project: kaleidoescape/hlt-2018

 def test_process_EU_money(self):
     sentences = gensent.SentenceGenerator()
     result = sentences._process_sentence("Breakfast cost me €5.60")
     correct = ['breakfast', 'cost', 'me', sentences.NUM]
     self.assertEqual(result, correct)
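
Both number tests expect any numeric token, with or without a currency symbol, to collapse into a single placeholder. A minimal sketch of that normalization follows. The regular expression, the `"<NUM>"` value, and the function name are assumptions for illustration; the actual value of `SentenceGenerator.NUM` and its matching rules are not shown in these examples.

```python
import re

NUM = "<NUM>"  # hypothetical placeholder value

# Optional currency symbol, digits, then optional groups such as ".60" or ",000".
NUMBER_RE = re.compile(r"^[€$£]?\d+(?:[.,]\d+)*$")


def process_sentence(sentence):
    """Lowercase, split on whitespace, and replace number-like tokens with NUM."""
    tokens = []
    for raw in sentence.lower().split():
        token = raw.strip(".,;:!?\"'()")  # drop surrounding punctuation only
        if not token:
            continue
        tokens.append(NUM if NUMBER_RE.match(token) else token)
    return tokens


print(process_sentence("Pi is 3.14159"))
print(process_sentence("Breakfast cost me €5.60"))
```

Mapping every number to one token keeps the vocabulary small, since individual numeric values rarely recur often enough to earn a useful word vector.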

Example #5

File: test_gensent.py Project: kaleidoescape/hlt-2018

 def test_process_sentence(self):
     sentences = gensent.SentenceGenerator()
     result = sentences._process_sentence(self.sentence_list[0])
     correct = ['i', 'am', 'sam', 'sam-i-am']
     self.assertEqual(result, correct)

Example #6

File: test_gensent.py Project: kaleidoescape/hlt-2018

 def test_two_passes(self):
     """Make sure we can make two passes over the sentence generator iterator."""
     sentences = gensent.SentenceGenerator()
     sentences.read_sentence_list(self.sentence_list)
     # list() makes one full pass over an iterable, so calling it twice makes two passes
     self.assertEqual(list(sentences), list(sentences))
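
A generator object can be consumed only once, so a second `list()` over it comes back empty. A class whose `__iter__` builds a fresh generator on every call can be traversed repeatedly, which is the property the test above checks. A minimal sketch, with the hypothetical class name `ReiterableSentences` standing in for the real implementation:

```python
class ReiterableSentences:
    """Hypothetical stand-in: an iterable that can be traversed any number of times."""

    def __init__(self, sentences):
        self._sentences = list(sentences)

    def __iter__(self):
        # A new generator is built on every call, so each pass starts from the top.
        for sentence in self._sentences:
            yield sentence.lower().split()


def one_shot(sentences):
    """A plain generator, for contrast: it is exhausted after a single pass."""
    for sentence in sentences:
        yield sentence.lower().split()


data = ["I am Sam", "Sam I am"]
reiterable = ReiterableSentences(data)
generator = one_shot(data)

print(list(reiterable) == list(reiterable))  # both passes see every sentence
print(list(generator), list(generator))      # the second pass is empty
```

This matters for the last example on this page: gensim's Word2Vec iterates over its corpus more than once (a vocabulary pass plus the training epochs), so it needs a restartable iterable rather than a one-shot generator.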

Example #7

File: test_gensent.py Project: kaleidoescape/hlt-2018

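 def test_generator_unprepared(self):
     """Make sure an unprepared sentence generator raises an error."""
     sentences = gensent.SentenceGenerator()
     # The failing call must run inside assertRaises. list() forces iteration
     # in case _gen_sentences is a lazy generator that raises only when consumed.
     with self.assertRaises(Exception):
         list(sentences._gen_sentences())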
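
One pitfall when testing generators for errors: if `_gen_sentences` is a generator function (an assumption, since its source is not shown here), calling it runs none of its body, so any check inside fires only once iteration starts. The sketch below demonstrates this with a hypothetical `gen_sentences` function and a small test case that is run programmatically.

```python
import unittest


def gen_sentences(source=None):
    """Hypothetical generator that fails when no source has been loaded."""
    if source is None:
        raise RuntimeError("no sentences loaded")
    for sentence in source:
        yield sentence


class LazyGeneratorTest(unittest.TestCase):
    def test_call_alone_does_not_raise(self):
        # Calling a generator function only creates the generator object.
        generator = gen_sentences()
        self.assertTrue(hasattr(generator, "__next__"))

    def test_iteration_raises(self):
        # The body, and therefore the check, runs on the first next().
        with self.assertRaises(RuntimeError):
            next(gen_sentences())


# Run the two tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LazyGeneratorTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

The context-manager form of `assertRaises` makes it explicit which statement is expected to fail. Passing an already-evaluated call as the second argument does not test that call at all.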

Example #8

        action='store_true',
        default=False,
        help='lemmatize the sentences before training word vectors')
    args = parser.parse_args()

    return args


args = parse_args()

print('Working on Dutch...')
start_time = time.time()

nl_direc = os.path.join(args.data_dir, 'nl')
nl_sents = gensent.SentenceGenerator(language='dutch',
                                     lemma=args.lemma,
                                     cstlemma_dir=args.cstlemma_dir)
nl_sents.read_directory(nl_direc)
nl_model = gensim.models.Word2Vec(nl_sents, **w2vconfig.gensim_config)
nl_vectors = nl_model.wv
print('Dutch word tokens: {}'.format(nl_sents.word_token_count))
# wv.vocab exists only in gensim < 4.0; gensim 4.x exposes wv.key_to_index instead.
print('Dutch vocab size: {}'.format(len(nl_model.wv.vocab)))
if args.lemma:
    nl_vectors_fp = os.path.join(args.vectors_dir, 'nl_vectors_lemma.txt')
else:
    nl_vectors_fp = os.path.join(args.vectors_dir, 'nl_vectors_nolemma.txt')
nl_vectors.save_word2vec_format(nl_vectors_fp, binary=False)

elapsed_time = time.time() - start_time
print('Elapsed time:', elapsed_time)
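
The beginning of this fragment, including the imports and most of `parse_args`, is cut off. For orientation only, here is a hypothetical sketch of a parser that would provide the attributes the script reads (`data_dir`, `vectors_dir`, `cstlemma_dir`, `lemma`). The flag spellings, defaults, and every help string except the one visible above are assumptions, not the project's actual interface.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical reconstruction; only the lemma help text comes from the fragment."""
    parser = argparse.ArgumentParser(
        description='train word vectors from a directory of sentences')
    parser.add_argument('--data_dir', default='data',
                        help='directory with one subdirectory per language (assumed)')
    parser.add_argument('--vectors_dir', default='vectors',
                        help='directory to write the vector files to (assumed)')
    parser.add_argument('--cstlemma_dir', default=None,
                        help='directory of the cstlemma tool (assumed)')
    parser.add_argument('--lemma',
                        action='store_true',
                        default=False,
                        help='lemmatize the sentences before training word vectors')
    # argv=None makes argparse read sys.argv; a list can be passed for testing.
    return parser.parse_args(argv)


args = parse_args(['--data_dir', 'corpus', '--lemma'])
print(args.data_dir, args.lemma)
```

Accepting an explicit `argv` list keeps the function testable without touching the real command line.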