Python Clusterer Examples

Programming language: Python

Namespace/package name: datamining.clusterer

Class/type: Clusterer

Examples on hotexamples.com: 4

Python Clusterer - 4 examples found. These are the top-rated real-world Python examples of datamining.clusterer.Clusterer, extracted from open source projects.

Frequently used methods

cluster_and_evaluate_data(1)

cluster_data(1)

Example #1

File: business_clusterer.py Project: anuragreddygv323/yelp

from datamining.clusterer import Clusterer
# BusinessETL, get_categories and binary_to_categories are defined elsewhere
# in the project; their imports are not part of this excerpt.

data_folder = '../../../../../../datasets/yelp_phoenix_academic_dataset/'
business_file_path = data_folder + 'yelp_academic_dataset_business.json'
# Build the category matrix and the category sets from the business file.
my_matrix = BusinessETL.create_category_matrix(business_file_path)
my_sets = BusinessETL.create_category_sets(business_file_path)
print('Data pre-processing done')

# Other algorithm names accepted by the Clusterer (disabled in this run):
# Clusterer.cluster_and_evaluate_data(my_matrix, 'k-means-scikit')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'k-means-nltk')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'mean-shift')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'ward')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'dbscan')
my_labels = Clusterer.cluster_data(my_matrix, 'dbscan')
my_categories = get_categories(business_file_path)

# One bucket per distinct label; the last bucket collects rows labeled -1.
size = len(set(my_labels))
clusters = [[] for _ in range(size)]

for i in range(len(my_labels)):
    categories = binary_to_categories(my_matrix[i], my_categories)
    if my_labels[i] == -1:
        clusters[size - 1].append(categories)
    else:
        clusters[int(my_labels[i])].append(categories)
    # print(my_labels[i])
# Clusterer.linkage(my_matrix[:3000])
# Clusterer.gaac(my_matrix[:500][:50])

sets = []
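
The bucketing loop in this example can also be written without relying on the number of distinct labels. The following is a minimal, standard-library sketch, not code from the project: `group_by_label`, `NOISE_LABEL`, and the sample data are hypothetical names introduced for illustration, and it assumes that -1 marks rows that belong to no cluster, as the excerpt's special case suggests.

```python
from collections import defaultdict

NOISE_LABEL = -1  # assumed marker for rows that belong to no cluster


def group_by_label(labels, items):
    """Group items by cluster label; rows labeled as noise are returned separately."""
    if len(labels) != len(items):
        raise ValueError("labels and items must have the same length")
    clusters = defaultdict(list)
    noise = []
    for label, item in zip(labels, items):
        if label == NOISE_LABEL:
            noise.append(item)
        else:
            clusters[int(label)].append(item)
    return dict(clusters), noise


# Hypothetical labels and items, standing in for my_labels and the decoded rows.
labels = [0, 1, -1, 0, 1, -1]
items = ["Pizza", "Bars", "Zoo", "Italian", "Pubs", "Notary"]
clusters, noise = group_by_label(labels, items)
print(clusters)  # {0: ['Pizza', 'Italian'], 1: ['Bars', 'Pubs']}
print(noise)     # ['Zoo', 'Notary']
```

Keeping noise in its own list avoids mixing it with a real cluster when the label set happens to contain no -1 entry.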

Example #2

File: tip_tfidf.py Project: neostoic/yelp-1

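    def clustering(file_path):
        # Vectorize the tips with TF-IDF, then cluster and evaluate them
        # using the Clusterer's 'k-means-scikit' option.
        vectorized = TipTfidf.tf_idf_tips(file_path)
        Clusterer.cluster_and_evaluate_data(vectorized, 'k-means-scikit')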
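
The source of `TipTfidf.tf_idf_tips` is not shown, so the following is only a rough standard-library sketch of what a TF-IDF vectorization step computes before clustering. The function `tf_idf`, the whitespace tokenization, the unsmoothed weight tf * log(N / df), and the sample tips are all assumptions made for illustration; library vectorizers typically add smoothing and normalization, so their weights will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical tips, standing in for the contents of file_path.
tips = ["great pizza", "great tacos", "slow service"]
vectors = tf_idf(tips)
# "pizza" occurs in one tip and "great" in two, so "pizza" gets the larger weight.
print(vectors[0]["pizza"] > vectors[0]["great"])  # True
```

Terms that are rare across tips receive higher weights, which is what lets a clustering step separate tips by their distinctive vocabulary.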

Example #3

from datamining.clusterer import Clusterer
# BusinessETL, get_categories and binary_to_categories are defined elsewhere
# in the project; their imports are not part of this excerpt.

data_folder = '../../../../../../datasets/yelp_phoenix_academic_dataset/'
business_file_path = data_folder + 'yelp_academic_dataset_business.json'
# Build the category matrix and the category sets from the business file.
my_matrix = BusinessETL.create_category_matrix(business_file_path)
my_sets = BusinessETL.create_category_sets(business_file_path)
print('Data pre-processing done')

# Other algorithm names accepted by the Clusterer (disabled in this run):
# Clusterer.cluster_and_evaluate_data(my_matrix, 'k-means-scikit')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'k-means-nltk')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'mean-shift')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'ward')
# Clusterer.cluster_and_evaluate_data(my_matrix, 'dbscan')
my_labels = Clusterer.cluster_data(my_matrix, 'dbscan')
my_categories = get_categories(business_file_path)

# One bucket per distinct label; the last bucket collects rows labeled -1.
size = len(set(my_labels))
clusters = [[] for _ in range(size)]

for i in range(len(my_labels)):
    categories = binary_to_categories(my_matrix[i], my_categories)
    if my_labels[i] == -1:
        clusters[size - 1].append(categories)
    else:
        clusters[int(my_labels[i])].append(categories)
    # print(my_labels[i])
# Clusterer.linkage(my_matrix[:3000])
# Clusterer.gaac(my_matrix[:500][:50])

# counts = count_categories(clusters)

Example #4

File: tip_tfidf.py Project: antoine-tran/yelp

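    def clustering(file_path):
        # Vectorize the tips with TF-IDF, then cluster and evaluate them
        # using the Clusterer's 'k-means-scikit' option.
        vectorized = TipTfidf.tf_idf_tips(file_path)
        Clusterer.cluster_and_evaluate_data(vectorized, 'k-means-scikit')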