rohitsakala/semanticAnnotationAcmCategories

Semantic Annotation of ACM research papers using ACM classification tree

Semantic annotation proceeds in two stages. First, words and documents are represented in a vector space using word2vec and doc2vec implementations. These vectors are then used as features to train a classifier, and the resulting model labels a document with categories from the 2012 ACM classification tree.

Setup Instructions

    $ workon myvirtualenv                                  [Optional]
    $ pip3 install -r requirements.txt

Download the ACM dataset into the ACM directory from here.

Building the Model

    $ python3 run.py

Classifying a Document

    $ python3 classify.py
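Because building and classifying are separate scripts, the trained model presumably has to be saved to disk in between. A minimal sketch of that hand-off with pickle might look like the following; the file name and the dictionary standing in for a trained model are hypothetical and not taken from the repository.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model (assumption); a real one would be a fitted classifier.
trained_model = {"labels": ["Networks", "Computing methodologies"], "vector_size": 20}

# Hypothetical file name, not taken from the repository.
model_path = os.path.join(tempfile.gettempdir(), "acm_model_sketch.pkl")

# Build step: serialize the model to disk.
with open(model_path, "wb") as f:
    pickle.dump(trained_model, f)

# Classify step: load the model back before predicting.
with open(model_path, "rb") as f:
    restored_model = pickle.load(f)

# Clean up the temporary file used by this sketch.
os.remove(model_path)
```

Only unpickle files you produced yourself, since loading a pickle can execute arbitrary code.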

Mentors

  • Course Instructor:
    • Vasudev Verma
  • TA:
    • Priya Radhakrishnan

Major Packages Required

  • nltk
  • gensim
  • numpy
  • scikit-learn
  • pickle (part of the Python standard library, no installation needed)

Members:

Research Papers

Quoc V. Le and Tomas Mikolov, "Distributed Representations of Sentences and Documents", ICML 2014.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, "Efficient Estimation of Word Representations in Vector Space", in Proceedings of Workshop at ICLR, 2013.

Cao et al., "A Novel Neural Topic Model and Its Supervised Extension", AAAI 2015.

Link (Le and Mikolov, 2014): https://cs.stanford.edu/~quocle/paragraph_vector.pdf

Resources are available here.

About

Given a research paper, label it with one of the categories from the 2012 ACM classification tree.
