GitHub - fancyspeed/keyword_extractor: keyword extractor (or tag extractor)

Algorithms for keyword (tag) extraction.

tf_idf_rank
- features: TF, IDF, pos-tagging
- filtering: by POS tag (such as adj, conj, ...)
- ranking: TF * IDF
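
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function name `tf_idf_rank`, its parameters, the tag set `FILTERED_POS`, the assumption that the listed tags are the ones dropped, and the smoothed IDF formula are all assumptions made for the example. Input words are expected to arrive already POS-tagged.

```python
import math
from collections import Counter

# Hypothetical tag names; assumed to be the tags that get filtered out.
FILTERED_POS = {"ADJ", "CONJ"}

def tf_idf_rank(doc, corpus, top_k=5):
    """Rank words of one document by TF * IDF.

    doc:    list of (word, pos_tag) pairs for the document to tag
    corpus: list of token lists, used only to estimate document frequency
    """
    # Filtering step: drop words whose POS tag is in the filter set.
    words = [w for w, pos in doc if pos not in FILTERED_POS]
    tf = Counter(words)
    n_docs = len(corpus)
    scores = {}
    for word, count in tf.items():
        # Document frequency: number of corpus documents containing the word.
        df = sum(1 for d in corpus if word in d)
        # Smoothed IDF (an assumed variant) so unseen words do not divide by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = (count / len(words)) * idf
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Calling `tf_idf_rank(doc, corpus)` returns up to `top_k` `(word, score)` pairs, best first.
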
text_rank
- features: word neighbors
- filtering: by POS tag (such as adj, conj, ...)
- ranking: TextRank, which works like PageRank but builds its relation matrix from the words' positions in the text.
- reference: http://www.cse.unt.edu/~rada/papers/mihalcea.emnlp04.pdf
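
As a rough sketch of the idea, the snippet below links words that appear within a small window of each other and then runs the PageRank-style update from the referenced paper on that graph. The function name `text_rank`, its parameters, and the default values are assumptions for illustration and are not taken from this repository; POS filtering is assumed to have been applied to `words` beforehand.

```python
from collections import defaultdict

def text_rank(words, window=2, damping=0.85, iterations=50):
    """Score words with a PageRank-style iteration over a co-occurrence graph."""
    # Build an undirected, unweighted graph: two distinct words are linked
    # if they occur within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)

    # Iterate S(v) = (1 - d) + d * sum(S(u) / degree(u)) over neighbors u of v.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for w in neighbors:
            rank = sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            new_scores[w] = (1 - damping) + damping * rank
        scores = new_scores

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A fixed iteration count keeps the sketch short; a convergence threshold on the score change would be the more usual stopping rule.
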
glm_rank (TODO)
- features: TF, IDF, pos-tagging, entity type, word position
- ranking: train a classification model and rank words by its predictions
semantic_rank (TODO)
- features: such as TF, IDF, POS, entity type ...
- first-ranking: classification model
- re-ranking: adjust the scores based on word co-occurrence (e.g. Kobe and Oneal support each other) or a topic model (whether a word belongs to the main topics)

##Requirements:
