def test_sample_doc_clustering_with_online(self):
    """Cluster the sample "orange" documents online and print the result.

    Feeds each fixture document into a 2-cluster online clusterer with a
    window of 3, then prints each cluster's document_dict for inspection.
    """
    oc = OnlineClusterer(N=2, window=3)
    samples = get_orange_clustering_test_data()
    for document in samples:
        # index is the position assigned to the document by the clusterer
        index = oc.add_document(document)
        oc.cluster(document)
    # NOTE(review): `expected` and `index` are never used and this test makes
    # no assertion — it only prints. Presumably the per-document cluster
    # assignments should be compared against `expected` ([0,0,0,1,1,1] means
    # the first three documents land in cluster 0 and the last three in
    # cluster 1) — TODO: add a real assertion once the cluster-membership
    # accessor is confirmed.
    expected = [0, 0, 0, 1, 1, 1]
    for cluster in oc.clusters:
        print cluster.document_dict
def test_cluster_term_document_matrix(self):
    """Verify the tf-idf term-document matrix built during online clustering.

    Clusters the sample "orange" documents and checks the resulting
    td_matrix against precomputed tf-idf weights, element-wise with a
    floating-point tolerance.

    NOTE(review): a method with this same name is defined again later in
    this class and shadows this one under unittest discovery — one of the
    two should be renamed or removed.
    """
    oc = OnlineClusterer(N=2, window=3)
    # BUG FIX: `samples` was undefined here (NameError); fetch the same
    # fixture the sibling sample-doc test uses.
    samples = get_orange_clustering_test_data()
    for document in samples:
        oc.add_document(document)
        oc.cluster(document)
    calculated = oc.td_matrix
    expected = numpy.array(
        [[0.31388923, 0.11584717, 0, 0, 0, 0, 0.47083384],
         [0, 0.13515504, 0.3662041, 0, 0.3662041, 0, 0],
         [0, 0, 0, 0.54930614, 0, 0.54930614, 0]])
    # BUG FIX: the original asserted expected.all() == calculated.all(),
    # which reduces each matrix to a single boolean and passes for almost
    # any pair of matrices. Compare element-wise with a tolerance instead.
    self.assertTrue(numpy.allclose(expected, calculated))
def test_cluster_term_document_matrix(self):
    """Verify the tf-idf term-document matrix built during online clustering.

    Clusters the sample "orange" documents and checks the resulting
    td_matrix against precomputed tf-idf weights, element-wise with a
    floating-point tolerance.

    NOTE(review): this is a duplicate of an earlier method with the same
    name in this class; under unittest discovery only this later definition
    runs. The two copies should be merged or renamed.
    """
    oc = OnlineClusterer(N=2, window=3)
    # BUG FIX: `samples` was undefined here (NameError); fetch the same
    # fixture the sibling sample-doc test uses.
    samples = get_orange_clustering_test_data()
    for document in samples:
        oc.add_document(document)
        oc.cluster(document)
    calculated = oc.td_matrix
    expected = numpy.array(
        [[0.31388923, 0.11584717, 0, 0, 0, 0, 0.47083384],
         [0, 0.13515504, 0.3662041, 0, 0.3662041, 0, 0],
         [0, 0, 0, 0.54930614, 0, 0.54930614, 0]])
    # BUG FIX: the original asserted expected.all() == calculated.all(),
    # which reduces each matrix to a single boolean and passes for almost
    # any pair of matrices. Compare element-wise with a tolerance instead.
    self.assertTrue(numpy.allclose(expected, calculated))
def test_online_clustering_with_tweets(self): from_date = datetime.datetime(2011, 1, 25, 0, 0, 0) to_date = datetime.datetime(2011, 1, 26, 0, 00, 0) items = ws.get_top_documents_by_date(from_date, to_date, threshold=1000) window = 300 oc = OnlineClusterer(N=50, window = window) for item in items: oc.cluster(item) clusters=oc.trimclusters() oc.dump_clusters_to_file("online_with_tweets") oc.plot_scatter() oc.plot_growth_timeline(cumulative=True) for cluster in oc.clusters: print cluster.id print cluster.get_size() print '-----------------'
def test_online_clustering_with_tweets(self): from_date = datetime.datetime(2011, 1, 25, 0, 0, 0) to_date = datetime.datetime(2011, 1, 26, 0, 00, 0) items = ws.get_documents_by_date(from_date, to_date, limit=200) window = 100 oc = OnlineClusterer(N=50, window = window) for item in items: oc.cluster(item) clusters=oc.trimclusters() oc.dump_clusters_to_file("online_with_tweets") #oc.plot_scatter() #oc.plot_growth_timeline(cumulative=True) for cluster in oc.clusters: sorted = cluster.summarize() for doc in sorted: print doc.dist, doc.raw print '--------------------'
def test_online_clustering_with_tweets(self): from_date = datetime.datetime(2011, 1, 25, 0, 0, 0) to_date = datetime.datetime(2011, 1, 26, 0, 00, 0) items = ws.get_top_documents_by_date(from_date, to_date, threshold=1000) window = 300 oc = OnlineClusterer(N=50, window=window) for item in items: oc.cluster(item) clusters = oc.trimclusters() oc.dump_clusters_to_file("online_with_tweets") oc.plot_scatter() oc.plot_growth_timeline(cumulative=True) for cluster in oc.clusters: print cluster.id print cluster.get_size() print '-----------------'