Python TextPreprocessor.frequency_term_matrix Examples

Programming language: Python

Namespace/package name: predictocite.datasets.preprocessing

Class/type: TextPreprocessor

Method/function: frequency_term_matrix

Examples on hotexamples.com: 3

Python TextPreprocessor.frequency_term_matrix: 3 code examples found. These are top-rated real-world Python examples of predictocite.datasets.preprocessing.TextPreprocessor.frequency_term_matrix, extracted from open source projects.

Frequently used methods

split_data(4)

frequency_term_matrix(3)

bag_of_words(1)

Example #1

File: test_preprocessing_data.py Project: RobSullivan/predictocite

	def test_create_frequency_term_matrix(self):
		"""
		Once the vocabulary has been indexed, create the frequency-term matrix.
		"""
		preprocessor = TextPreprocessor(self.articles)
		split_data = preprocessor.split_data()
		# Fit the vectorizer on the training split so the vocabulary is indexed.
		preprocessor.count_vect.fit_transform(split_data['train'])
		frequency_term_matrix = preprocessor.frequency_term_matrix(split_data['train'])  # preprocessor.count_vect.transform(split_data['train'])

		# The result should be matrix-like, i.e. expose a transpose attribute.
		self.assertTrue(hasattr(frequency_term_matrix, 'transpose'))
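
The internals of `TextPreprocessor.frequency_term_matrix` are not shown on this page. As a rough illustration of what a frequency-term (document-term count) matrix contains, here is a minimal standard-library sketch. The helper names `build_vocabulary` and `term_count_matrix` are hypothetical, whitespace tokenization is an assumption, and the real method presumably returns a sparse matrix rather than nested lists.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (hypothetical helper)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def term_count_matrix(documents, vocabulary):
    """Return one row per document holding the count of each vocabulary term."""
    matrix = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        row = [0] * len(vocabulary)
        for tok, n in counts.items():
            if tok in vocabulary:  # terms not seen when building the vocabulary are ignored
                row[vocabulary[tok]] = n
        matrix.append(row)
    return matrix


# Assumed toy data, not taken from the predictocite project.
train_docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(train_docs)
counts = term_count_matrix(train_docs, vocab)
```

Each row corresponds to a document and each column to a vocabulary term, which is why the test above only checks that the result behaves like a matrix.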

Example #2

File: test_preprocessing_data.py Project: RobSullivan/predictocite

	def test_tfidf_weighting(self):
		preprocessor = TextPreprocessor(self.articles)
		split_data = preprocessor.split_data()
		term_freq_matrix = preprocessor.frequency_term_matrix(split_data['train'])

		# Compute the IDF weights for the term-frequency matrix with fit().
		preprocessor.tf_transformer.fit(term_freq_matrix)
		# Once the weights are computed, transform the term-frequency matrix
		# into the tf-idf weight matrix.
		tf_idf_matrix = preprocessor.tf_transformer.transform(term_freq_matrix)

		self.assertTrue(hasattr(tf_idf_matrix.todense(), 'shape'))
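
The fit-then-transform split used in this test can be sketched without any third-party library. The version below is an illustration only: `fit_idf` and `transform_tfidf` are hypothetical names, and the smoothed formula `ln((1 + n) / (1 + df)) + 1` is the default used by scikit-learn's `TfidfTransformer`, which `tf_transformer` is assumed (not confirmed by the source) to wrap.

```python
import math


def fit_idf(count_matrix):
    """Compute a smoothed inverse document frequency for every column."""
    n_docs = len(count_matrix)
    n_terms = len(count_matrix[0])
    idf = []
    for col in range(n_terms):
        # Number of documents in which this term occurs at least once.
        doc_freq = sum(1 for row in count_matrix if row[col] > 0)
        idf.append(math.log((1 + n_docs) / (1 + doc_freq)) + 1.0)
    return idf


def transform_tfidf(count_matrix, idf):
    """Scale each raw count by the IDF weight of its column (no normalization)."""
    return [[count * w for count, w in zip(row, idf)] for row in count_matrix]


# Assumed toy count matrix: 2 documents, 5 terms.
counts = [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
idf = fit_idf(counts)
weighted = transform_tfidf(counts, idf)
```

Terms that appear in every document receive a weight of 1.0, while rarer terms are weighted up. Row normalization is deliberately left out of this sketch.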

Example #3

File: test_preprocessing_data.py Project: RobSullivan/predictocite

	def test_term_frequency_features(self):
		"""
		Test for the tf-idf helper,
		the last step before classification.
		"""
		# tfidf_transformer = TfidfTransformer()
		preprocessor = TextPreprocessor(self.articles)
		split_data = preprocessor.split_data()

		term_freq_matrix = preprocessor.frequency_term_matrix(split_data['train'])

		# fit() returns the fitted transformer, whose norm is expected to be 'l2'.
		tfidf = preprocessor.tf_transformer.fit(term_freq_matrix)
		self.assertEqual(tfidf.norm, 'l2')
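
The `'l2'` value checked here refers to L2 (Euclidean) row normalization. As a hedged illustration of what that setting means for a weighted matrix, the following standard-library sketch scales each row to unit length. The function name `l2_normalize` is hypothetical and is not part of predictocite.

```python
import math


def l2_normalize(rows):
    """Scale each row to unit Euclidean length; all-zero rows are left unchanged."""
    normalized = []
    for row in rows:
        length = math.sqrt(sum(v * v for v in row))
        normalized.append([v / length for v in row] if length else list(row))
    return normalized


# Assumed toy input: one ordinary row and one all-zero row.
unit_rows = l2_normalize([[3.0, 4.0], [0.0, 0.0]])
```

After normalization, the dot product of two rows equals their cosine similarity, which is one common reason for applying this step before classification.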