GitHub - sophie4869/topic-model

Folders and files (10 commits):
autoencoder
bow_text_cls/mxnet_
bow.py
chi_,.txt
chi_n.backup
chi_n.txt
classify.py
dict.txt
read.py
readme.txt
t.py
test.py
tfidf.py
tfidf_cmd.py

---.txt---
dict.txt: vocabulary for feature extraction
chi_n.txt: Chinese stopwords separated by '\n'
chi_,.txt: Chinese stopwords separated by ','
chi_,.backup: same as chi_,.txt
features_10000.txt: the 10000 most common words from 26 data files
features_5000.txt: the 5000 most common words from e-1 data files
jieba_cut_result.txt: jieba cut result of e-1.json
stanford_cut_result.txt: stanford cut result of ?
extraction.txt: all text content of e-1.json, concatenated
jieba_pw_compareResult.txt: as the file name says
output.txt: ? cut result (very large)
test*.txt: test text and its cutting result
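
The two stopword layouts and the "most common words" feature files above can be illustrated with a minimal sketch. It is not taken from the repository: the helper names parse_stopwords and top_features are hypothetical, the sample strings stand in for the real chi_n.txt / chi_,.txt contents, and the documents are assumed to be already cut into tokens.

```python
from collections import Counter

def parse_stopwords(text, sep="\n"):
    """Split a stopword file's contents on `sep` and drop empty entries."""
    return {w.strip() for w in text.split(sep) if w.strip()}

def top_features(docs, stopwords, n):
    """Return the n most frequent tokens across docs, ignoring stopwords."""
    counts = Counter(tok for doc in docs for tok in doc if tok not in stopwords)
    return [tok for tok, _ in counts.most_common(n)]

# Stand-ins for the two stopword layouts (assumed sample content).
newline_style = "的\n了\n是\n"   # chi_n.txt layout: one word per line
comma_style = "的,了,是"         # chi_,.txt layout: comma separated

stop = parse_stopwords(newline_style, "\n")
same_stop = parse_stopwords(comma_style, ",")

# Documents are assumed to be segmented already (lists of tokens).
docs = [["我", "喜欢", "的", "电影"], ["电影", "是", "好", "电影"]]
features = top_features(docs, stop, 2)
```

Both layouts parse to the same set, so either file can feed the same filtering step; a features_*.txt file would then be the output of top_features with n set to 5000 or 10000.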

---.py---
read.py: getText, extracts the text from a .json file
bow.py: bag of words on jieba cut results
tfidf.py: TF-IDF function
tfidf_cmd.py: run TF-IDF from the command line
classify.py: classification based on bag-of-words features

addstop.py: add a stop word to the three chi*.txt files
test.py: pwCount, jiebaCount, and TF-IDF functions
t.py: test
word2vec.py: word2vec using gensim
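
For readers unfamiliar with the bag-of-words and TF-IDF steps named above, here is a rough sketch in plain Python. It is not the code in bow.py or tfidf.py: the function names bow_vector and tfidf_vectors are hypothetical, the unsmoothed idf = log(N / df) is one common variant chosen as an assumption, and the input is assumed to be token lists already produced by a segmenter such as jieba.

```python
import math
from collections import Counter

def bow_vector(tokens, vocab):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def tfidf_vectors(docs, vocab):
    """TF-IDF per document, with tf = count / doc length and
    idf = log(N / df) (assumed variant, no smoothing)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    vectors = []
    for d in docs:
        counts = Counter(d)
        total = len(d) or 1  # avoid division by zero on empty documents
        vec = []
        for w in vocab:
            tf = counts[w] / total
            idf = math.log(n_docs / df[w]) if df[w] else 0.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors

# Toy vocabulary and pre-segmented documents (assumed sample data).
vocab = ["电影", "音乐", "好"]
docs = [["电影", "好", "电影"], ["音乐", "好"]]

bow = [bow_vector(d, vocab) for d in docs]
tfidf = tfidf_vectors(docs, vocab)
```

In this toy data the word "好" appears in every document, so its idf is log(1) = 0 and it contributes nothing to either TF-IDF vector, while its raw bag-of-words count is still 1 in both.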


---.other---
jieba.model: Word2Vec model generated using jieba cutting


Languages

Python 100.0%
