Example #1
# Imports used below; tfv, data_*, labels_train, count_*, k_best_nb, the basic_model*
# objects, new_mat_train, NBmatrix and first_layer are defined earlier in the script.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

data_test = tfv.transform(data_test)

# Feature selection on the TF-IDF matrix: keep the k_best_nb best chi-squared features
chi = SelectKBest(chi2, k=k_best_nb)
data_train = chi.fit_transform(data_train, labels_train)
data_test = chi.transform(data_test)
 
# Feature selection on the CountVectorizer matrix
chi = SelectKBest(chi2, k=k_best_nb)
count_matrix = chi.fit_transform(count_matrix, labels_train)
count_test = chi.transform(count_test)
 
 
# NB-transformed features (NBmatrix, Wang & Manning style)
nbmat = NBmatrix(1.0, bina=True, n_jobs=1)
nbmat.fit(count_matrix, labels_train)
nbm_test = nbmat.transform(count_test)
nbm_data = nbmat.transform(count_matrix)
 
########################### Train part ########################################

# First-layer models trained on the TF-IDF features
proba1, basic_score1, basic_name1 = first_layer(basic_model1, data_train, labels_train, data_train, labels_train)
# First-layer models trained on the new feature matrix
proba2, basic_score2, basic_name2 = first_layer(basic_model2, new_mat_train, labels_train, new_mat_train, labels_train)
# First-layer models trained on the NB-transformed features
proba3, basic_score3, basic_name3 = first_layer(basic_model3, nbm_data, labels_train, nbm_data, labels_train)
# Concatenate the first-layer probabilities column-wise
proba = np.hstack([proba1, proba2, proba3])
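
The first_layer helper is project code that is not shown on this page. As a rough sketch of what such a stacking first layer typically does, with an assumed signature and return values rather than the project's actual implementation:

# Hypothetical sketch of a first_layer-style helper (assumed behaviour, not the project's code).
# It takes (name, estimator) pairs, fits each one on the training matrix, and returns the
# positive-class probabilities on the evaluation matrix, stacked column-wise, together with
# the accuracy and the name of each model.
import numpy as np

def first_layer_sketch(models, X_train, y_train, X_eval, y_eval):
    probas, scores, names = [], [], []
    for name, model in models:
        model.fit(X_train, y_train)
        probas.append(model.predict_proba(X_eval)[:, 1])  # requires classifiers with predict_proba
        scores.append(model.score(X_eval, y_eval))         # plain accuracy on the evaluation set
        names.append(name)
    return np.column_stack(probas), scores, names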
 
Example #2
# X = X_T

# Imports used below; ct (a project text-processing module), NBmatrix, data and labels
# are defined elsewhere in the project.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Remove HTML tags
train = ct.removehtml(data)

# Build the dictionary by stemming and tokenizing the reviews (WARNING: nltk should be up to date)
data = ct.stemTokenize(train)

# Compute TF-IDF features, including n-grams up to size 2
tfidf_vectorizer = TfidfVectorizer(ngram_range=(1, 2), binary=False)

# Compute raw term counts with a CountVectorizer, also with n-grams up to size 2
count_vectorizer = CountVectorizer(ngram_range=(1, 2), binary=False)

# Compute an NB matrix as described by Wang & Manning
nb_vectorizer = NBmatrix(alpha=1.0, bina=True, n_jobs=1)

# Fit and transform on the data
tfidf_matrix = tfidf_vectorizer.fit_transform(data)
count_matrix = count_vectorizer.fit_transform(data)
nb_matrix = nb_vectorizer.fit_transform(count_matrix,labels)

print "size of the matrix : ", tfidf_matrix.shape
average_nb_words = np.mean(count_matrix.sum(axis=1))
print "Average number of words per review : ", average_nb_words
dic_size = count_matrix.shape[1]
print "dictionnary size : " , dic_size
sparsity = 1-float(count_matrix.nnz)/(25000.0*dic_size)
print "Sparsity of the data : ", sparsity