# Assumed imports: module paths are inferred from the utils.pattern_utils
# naming convention used in __main__ below and may differ from the project's
# actual layout.
import utils.file_iterator as fi
import utils.multiprocess_utils as mu
# import utils.array_utils as au  # needed only for the commented sampling line
from utils.id_freq_dict import IdFreqDict
from config.configure import getcfg


def get_tokens_multi(file_path):
    file_path = fi.add_sep_if_needed(file_path)
    # subfiles = au.random_array_items(fi.listchildren(file_path, children_type=fi.TYPE_FILE), 20)
    subfiles = fi.listchildren(file_path, children_type=fi.TYPE_FILE)
    # Split the file list into 20 blocks and tokenize them in parallel;
    # each worker returns an (IdFreqDict, doc_num) pair.
    file_list_block = mu.split_multi_format([(file_path + subfile) for subfile in subfiles], process_num=20)
    res_list = mu.multi_process(get_tokens, [(file_list,) for file_list in file_list_block])
    # Merge the per-process dictionaries and document counts into one total.
    id_freq_dict, total_doc_num = IdFreqDict(), 0
    for ifd, doc_num in res_list:
        total_doc_num += doc_num
        id_freq_dict.merge_freq_from(ifd)
    print('total_doc_num', total_doc_num, 'total vocabulary_size', id_freq_dict.vocabulary_size())
    id_freq_dict.drop_words_by_condition(3)
    id_freq_dict.dump_dict(getcfg().post_dict_file)
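
# For reference, split_multi_format/multi_process above amount to a plain
# fan-out/merge over a process pool. Below is a stdlib-only sketch of the same
# pattern, assuming get_tokens(file_list) returns an (IdFreqDict, doc_num)
# pair as implied above; it is an illustration, not the project's mu wrappers.
def get_tokens_multi_pool(file_path, workers=20):
    from multiprocessing import Pool
    file_path = fi.add_sep_if_needed(file_path)
    subfiles = fi.listchildren(file_path, children_type=fi.TYPE_FILE)
    paths = [file_path + subfile for subfile in subfiles]
    blocks = [paths[i::workers] for i in range(workers)]  # round-robin split
    with Pool(workers) as pool:
        results = pool.map(get_tokens, blocks)  # one (ifd, doc_num) per block
    merged, total_doc_num = IdFreqDict(), 0
    for ifd, doc_num in results:
        total_doc_num += doc_num
        merged.merge_freq_from(ifd)
    return merged, total_doc_num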
class IfdGetter:
    # NOTE: the head of this class is missing from the excerpt; __init__ and
    # the method signature below are a minimal reconstruction around the
    # surviving self.ifd.load_dict(ifd_file) call.
    def __init__(self, ifd_file):
        self.ifd_file, self.ifd = ifd_file, None

    def __call__(self, ifd_file=None):
        # Load the underlying IdFreqDict lazily, on first access.
        if self.ifd is None:
            ifd_file = ifd_file if ifd_file is not None else self.ifd_file
            self.ifd = IdFreqDict()
            self.ifd.load_dict(ifd_file)
        return self.ifd


# pre_dict_file = getcfg().pre_dict_file
post_dict_file = getcfg().post_dict_file
token_dict = IfdGetter(post_dict_file)
pre_list = [getcfg().pre_prop_file, getcfg().pre_comm_file, getcfg().pre_verb_file, getcfg().pre_hstg_file]
post_list = [getcfg().post_prop_file, getcfg().post_comm_file, getcfg().post_verb_file, getcfg().post_hstg_file]
# prop_dict, comm_dict, verb_dict, hstg_dict = [IfdGetter(post_file) for post_file in post_list]


if __name__ == '__main__':
    import utils.pattern_utils as pu

    def word_remove(word, freq):
        # Drop a word if it contains any of the listed punctuation characters
        # or occurs fewer than 10 times.
        if pu.search_pattern(r'[!?<>.,&\'`^*]', word) is not None or freq < 10:
            return True
        return False

    # Convert each raw (pre) dictionary into a filtered (post) dictionary.
    pre2post = dict(zip(pre_list, post_list))
    for pre, post in pre2post.items():
        ifd = IdFreqDict()
        ifd.load_dict(pre)
        pre_vocab = ifd.vocabulary_size()
        print('{} loaded, {} words'.format(pre, pre_vocab))
        # drop_words_by_condition also accepts a callable predicate, unlike
        # the int threshold used in get_tokens_multi above.
        ifd.drop_words_by_condition(word_remove)
        print('dropped {} words, {} words remain'.format(
            pre_vocab - ifd.vocabulary_size(), ifd.vocabulary_size()))
        ifd.dump_dict(post)
    print('dump over')
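
    # Illustrative sanity checks for word_remove, assuming pu.search_pattern
    # behaves like re.search (returns None on no match); the sample words are
    # hypothetical and not part of the original script.
    assert word_remove('of', 3)           # dropped: frequency below 10
    assert word_remove('wh*at', 50)       # dropped: contains punctuation
    assert not word_remove('hello', 50)   # kept: clean and frequent enough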