
smartaec/pykit


pykit

This is a Python toolkit for mining Chinese text (currently) and other data (in the future).

Visual text mining features (module vistext)

  1. Extract keywords using TF-IDF and TextRank (based on https://github.com/fxsjy/jieba)
  2. Extract key phrases, based on some code from https://github.com/letiantian/TextRank4ZH/tree/master/textrank4zh
  3. Generate word clouds, based on WordCloud (https://github.com/amueller/word_cloud)
  4. Generate word frequencies and a co-occurrence network (from https://github.com/ipython/talks/blob/master/parallel/text_analysis.py)
  5. Create a word2vec model, based on gensim
  6. Generate a dendrogram of keywords from their word vectors
  7. Cluster keywords with k-means
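
To make feature 4 concrete, here is a minimal sketch of counting word frequencies and co-occurrence edges using only the standard library. It is not the vistext API: the function names `word_frequencies` and `cooccurrence_edges`, the `min_count` parameter, and the sample documents are all assumptions made for illustration. It also assumes the text has already been segmented into tokens (for example with jieba), and it treats two words as co-occurring when they appear in the same document.

```python
from collections import Counter
from itertools import combinations


def word_frequencies(docs):
    """Count how often each token appears across all documents."""
    return Counter(token for doc in docs for token in doc)


def cooccurrence_edges(docs, min_count=1):
    """Count unordered token pairs that appear in the same document.

    Each pair is counted at most once per document and is stored with
    its two tokens in sorted order, so (a, b) and (b, a) share one key.
    Pairs seen in fewer than min_count documents are dropped.
    """
    pairs = Counter()
    for doc in docs:
        unique = sorted(set(doc))
        pairs.update(combinations(unique, 2))
    return {pair: n for pair, n in pairs.items() if n >= min_count}


# Hypothetical pre-segmented documents; a segmenter such as jieba
# would normally produce these token lists from raw Chinese text.
docs = [
    ["文本", "挖掘", "工具"],
    ["文本", "挖掘", "可视化"],
    ["关键词", "文本"],
]

freqs = word_frequencies(docs)                  # token -> count
edges = cooccurrence_edges(docs, min_count=2)   # (token, token) -> count
```

The resulting `edges` dictionary can be read as a weighted edge list: each key is an undirected edge between two words, and each value is the number of documents in which both appear. Raising `min_count` is one simple way to thin the network before drawing it.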

License

All materials in this repository are licensed CC-BY, and I encourage reuse!
