Python StatsTool.parse_topic Examples

Programming language: Python

Namespace/package name: stats_tool

Class/type: StatsTool

Method/function: parse_topic

Examples on hotexamples.com: 5

Python StatsTool.parse_topic - 5 code examples found. These are real-world Python examples of stats_tool.StatsTool.parse_topic, extracted from open source projects and selected from the top-rated ones.

Frequently used methods

load_data(7)

retrieve_feature(7)

remove_emoticon(5)

replace_url(4)

load_raw_data(3)

replace_target(3)

_retrieve_emoticon(2)

parse_spatial(2)

parse_topic(2)

preprocess(2)

random_shardlize(2)

remove_parenthesis(1)

Example #1

File: stats_parser.py Project: BrightSirius/sentiment

import json
import time

# ST and ET are helpers from the project (ST exposes parse_topic, ET exposes
# write_file); their imports are not shown in this excerpt.


def parse_topic_public_stats(in_path='../stats/train_public_stats',
                             out_path='../test_data/topic_test_data'):
    st_t = time.time()
    topic_cnt, total_cnt = 0, 0
    topic2txt = {}  # topic -> list of texts that carry it
    with open(in_path, 'r') as f:
        for line in f:
            total_cnt += 1
            dic = json.loads(line.strip())
            txt = dic['text']
            topic = ST.parse_topic(txt)
            if not topic:
                continue
            topic_cnt += 1  # count texts that carry a topic
            topic2txt.setdefault(topic, []).append(txt)

    # Visit topics from the most texts to the fewest.
    topics = sorted(topic2txt.keys(), key=lambda x: len(topic2txt[x]), reverse=True)
    for t in topics:
        txts = topic2txt[t]
        if len(txts) > 7000:  # skip very frequent topics
            continue
        if len(txts) < 200:  # every remaining topic is smaller, so stop
            break
        for txt in txts:
            dic = {t: txt}
            ET.write_file(out_path, 'a', '%s\n' % json.dumps(dic))

    print('total cnt: %s. topic stats cnt: %s' % (total_cnt, topic_cnt))
    print('topic cnt: %s' % len(topic2txt))
    print('time used: %.2f' % (time.time() - st_t))
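The selection logic in this example (group texts by topic, then keep only topics whose text count falls within a band) can be separated from the file I/O. The following is a minimal sketch, not the project's code. `parse_topic_stub` is a hypothetical stand-in for `ST.parse_topic` that assumes a topic is written as `#topic#`; the real method is not shown on this page and may behave differently. `group_by_topic` and `select_topics` are illustrative names, and the default bounds mirror the 200 and 7000 used above.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for ST.parse_topic: assumes topics look like #topic#.
_TOPIC_RE = re.compile(r'#([^#]+)#')


def parse_topic_stub(txt):
    """Return the first #topic# found in txt, or None if there is none."""
    match = _TOPIC_RE.search(txt)
    return match.group(1) if match else None


def group_by_topic(texts, parse_topic=parse_topic_stub):
    """Map each topic to the list of texts that carry it."""
    topic2txt = defaultdict(list)
    for txt in texts:
        topic = parse_topic(txt)
        if topic:
            topic2txt[topic].append(txt)
    return dict(topic2txt)


def select_topics(topic2txt, min_size=200, max_size=7000):
    """Return topics with min_size <= text count <= max_size, largest first."""
    topics = sorted(topic2txt, key=lambda t: len(topic2txt[t]), reverse=True)
    return [t for t in topics if min_size <= len(topic2txt[t]) <= max_size]


# Usage with tiny bounds so the effect is visible on four texts.
groups = group_by_topic(['#a# x', '#a# y', '#b# z', 'no topic'])
print(select_topics(groups, min_size=2, max_size=10))  # ['a']
```

Filtering with a single comprehension gives the same result as the `continue`/`break` pair above, because the topics are already sorted by descending size.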

Example #2

File: online_analyser.py Project: BrightSirius/sentiment

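# Method excerpt from a class in online_analyser.py. self.profile_topic is a
# dict, and ST is the project's StatsTool helper (import not shown).
def update_profile_topic(self, raw_stats, tags):
    for txt, tag in zip(raw_stats, tags):
        topic = ST.parse_topic(txt)
        if not topic:
            continue
        # Start every new topic with zero counts for the three tags.
        self.profile_topic.setdefault(topic, {"P": 0, "N": 0, "O": 0})
        self.profile_topic[topic][tag] += 1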
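To make the resulting data shape concrete, here is a standalone sketch of the same tally as a plain function. It is an illustration under assumptions, not the project's code: `tally_topic_tags` is a made-up name, and `parse_topic_stub` is a hypothetical replacement for `ST.parse_topic` that assumes a `#topic#` convention.

```python
import re

# Hypothetical stand-in for ST.parse_topic: assumes topics look like #topic#.
_TOPIC_RE = re.compile(r'#([^#]+)#')


def parse_topic_stub(txt):
    """Return the first #topic# found in txt, or None if there is none."""
    match = _TOPIC_RE.search(txt)
    return match.group(1) if match else None


def tally_topic_tags(raw_stats, tags, parse_topic=parse_topic_stub):
    """Count how often each tag ("P", "N", "O") occurs per topic."""
    profile_topic = {}
    for txt, tag in zip(raw_stats, tags):
        topic = parse_topic(txt)
        if not topic:
            continue  # texts without a topic do not contribute
        counts = profile_topic.setdefault(topic, {"P": 0, "N": 0, "O": 0})
        counts[tag] += 1
    return profile_topic


profile = tally_topic_tags(
    ['#a# good', '#a# bad', 'no topic here', '#b# meh'],
    ['P', 'N', 'P', 'O'],
)
print(profile)
# {'a': {'P': 1, 'N': 1, 'O': 0}, 'b': {'P': 0, 'N': 0, 'O': 1}}
```

A tag outside the three keys would raise `KeyError`, as it would in the method above.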

Example #3

File: online_analyser.py Project: BrightSirius/sentiment

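# Method excerpt from a class in online_analyser.py. self.profile_topic is a
# dict, and ST is the project's StatsTool helper (import not shown).
def update_profile_topic(self, raw_stats, tags):
    for txt, tag in zip(raw_stats, tags):
        topic = ST.parse_topic(txt)
        if not topic:
            continue
        # Start every new topic with zero counts for the three tags.
        self.profile_topic.setdefault(topic, {"P": 0, "N": 0, "O": 0})
        self.profile_topic[topic][tag] += 1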

Example #4

File: stats_parser.py Project: BrightSirius/sentiment

import json
import time

# ST and ET are helpers from the project (ST exposes parse_topic, ET exposes
# write_file); their imports are not shown in this excerpt.


def parse_topic_public_stats(in_path='../stats/train_public_stats',
                             out_path='../test_data/topic_test_data'):
    st_t = time.time()
    topic_cnt, total_cnt = 0, 0
    topic2txt = {}  # topic -> list of texts that carry it
    with open(in_path, 'r') as f:
        for line in f:
            total_cnt += 1
            dic = json.loads(line.strip())
            txt = dic['text']
            topic = ST.parse_topic(txt)
            if not topic:
                continue
            topic_cnt += 1  # count texts that carry a topic
            topic2txt.setdefault(topic, []).append(txt)

    # Visit topics from the most texts to the fewest.
    topics = sorted(topic2txt.keys(),
                    key=lambda x: len(topic2txt[x]),
                    reverse=True)
    for t in topics:
        txts = topic2txt[t]
        if len(txts) > 7000:  # skip very frequent topics
            continue
        if len(txts) < 200:  # every remaining topic is smaller, so stop
            break
        for txt in txts:
            dic = {t: txt}
            ET.write_file(out_path, 'a', '%s\n' % json.dumps(dic))

    print('total cnt: %s. topic stats cnt: %s' % (total_cnt, topic_cnt))
    print('topic cnt: %s' % len(topic2txt))
    print('time used: %.2f' % (time.time() - st_t))

Example #5

File: simulator.py Project: BrightSirius/sentiment

# Method excerpt from a class in simulator.py. self.stats is iterated as
# (name, txts) pairs, and ST is the project's StatsTool helper (import not shown).
def parse_topics_realtime(self):
    topic_cnt, total_cnt = 0, 0
    topic2txt = {}  # topic -> list of texts that carry it
    for name, txts in self.stats:
        for txt in txts:
            total_cnt += 1
            topic = ST.parse_topic(txt)
            if not topic:
                continue
            topic_cnt += 1
            topic2txt.setdefault(topic, []).append(txt)
    print('total cnt: %s. topic stats cnt: %s' % (total_cnt, topic_cnt))
    print('topic cnt: %s' % len(topic2txt))
    return topic2txt
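The returned mapping from topic to texts can be summarized directly. As a small sketch of one possible follow-up step (the helper name `top_topics` is an assumption and does not appear in the project), this ranks topics by how many texts they carry:

```python
def top_topics(topic2txt, n=3):
    """Return the n topics with the most texts as (topic, count) pairs."""
    counts = [(topic, len(txts)) for topic, txts in topic2txt.items()]
    # Sort by count descending, then by topic name so ties are deterministic.
    counts.sort(key=lambda pair: (-pair[1], pair[0]))
    return counts[:n]


sample = {'a': ['x', 'y'], 'b': ['z'], 'c': ['u', 'v']}
print(top_topics(sample, n=2))  # [('a', 2), ('c', 2)]
```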