IngleJaya95/Word-Similarity-Estimation-: Estimate similarities between words using WordNet-based measures, ESA, LSA, and word2vec. The correlation of each technique with WordSim353 is also calculated and compared.

Code Information : The code consists of 6 folders: 5 for the individual methods and 1 for the Google embedding. All the code is commented and presented as Jupyter (IPython) notebooks, so it should be easy to follow. For word2vec, however, you have to run the makefile first and then the bash file.

Software Requirement :

1. Java NetBeans
2. Python packages (having the Anaconda distribution helps):
   a. Jupyter Notebook
   b. Pandas
   c. Numpy
   d. Scipy
   e. nltk
   f. Sklearn
   g. Seematch
   h. os
3. GCC compiler for the C code.

System Requirement : A decent system with 4 GB of RAM and 50 GB of hard-disk space (the ESA index occupies most of it). A higher-configuration system is recommended.

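The WordSim353 file used can vary from corpus to corpus, because not all of its words are necessarily present in a given corpus vocabulary. Word pairs with a missing word have to be left out before the correlation is computed.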
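The vocabulary filtering described above can be sketched as follows. This is a minimal illustration, not the notebooks' actual code: the function names, the toy vocabulary, the toy model scores, and the gold scores are all made up for the example, and the use of Spearman rank correlation is an assumption (the readme only says "correlation"). With Scipy installed, `scipy.stats.spearmanr` could replace the hand-written ranking and correlation helpers.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists (nan if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if norm_x == 0 or norm_y == 0:
        return float("nan")
    return cov / (norm_x * norm_y)


def spearman_on_covered_pairs(gold_pairs, vocabulary, similarity):
    """Drop pairs with an out-of-vocabulary word, then correlate the rest."""
    covered = [(w1, w2, score) for w1, w2, score in gold_pairs
               if w1 in vocabulary and w2 in vocabulary]
    human = [score for _, _, score in covered]
    model = [similarity(w1, w2) for w1, w2, _ in covered]
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(human), average_ranks(model)), len(covered)


# Hypothetical data: these are NOT real WordSim353 scores or model outputs.
gold_pairs = [("tiger", "cat", 7.0), ("book", "paper", 7.5),
              ("king", "cabbage", 0.2), ("plane", "car", 5.8),
              ("zebra", "stock", 1.0)]
vocabulary = {"tiger", "cat", "book", "paper", "king", "cabbage", "plane", "car"}
toy_scores = {("tiger", "cat"): 0.8, ("book", "paper"): 0.6,
              ("king", "cabbage"): 0.1, ("plane", "car"): 0.7}

rho, n_used = spearman_on_covered_pairs(
    gold_pairs, vocabulary, lambda a, b: toy_scores[(a, b)])
print(f"pairs used: {n_used} of {len(gold_pairs)}, Spearman rho: {rho:.2f}")
```

Because the number of covered pairs differs between corpora, it is worth reporting it next to each correlation so that methods evaluated on different subsets are not compared blindly.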

Warning : Increasing the amount of data beyond a certain threshold in some of the notebooks can be dangerous for your system, and you are solely responsible for your actions. Python packages such as sklearn require a large amount of memory for the TF-IDF matrix if it is converted from the sparse storage format to a normal (dense) NumPy array.
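To see why the dense conversion is risky, the rough arithmetic can be sketched as below. The corpus size, vocabulary size, and average number of distinct terms per document are hypothetical figures chosen only for illustration, and the helper names are not from the repository. The estimate assumes 8-byte float values and 4-byte integer indices in a CSR (compressed sparse row) layout, which is what SciPy typically uses for matrices of this size.

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Memory for a dense array: every cell is stored, zero or not."""
    return n_rows * n_cols * itemsize


def csr_bytes(n_rows, nnz, itemsize=8, index_size=4):
    """Memory for a CSR matrix: values, column indices, and row pointers."""
    return nnz * itemsize + nnz * index_size + (n_rows + 1) * index_size


# Hypothetical corpus: 100,000 documents, 200,000 distinct terms,
# about 150 distinct terms per document.
n_docs, n_terms, terms_per_doc = 100_000, 200_000, 150
nnz = n_docs * terms_per_doc  # number of non-zero TF-IDF entries

dense = dense_bytes(n_docs, n_terms)
sparse = csr_bytes(n_docs, nnz)
print(f"dense:  {dense / 1e9:.2f} GB")   # 160.00 GB
print(f"sparse: {sparse / 1e9:.2f} GB")  # 0.18 GB
```

Under these assumptions the dense array would need far more memory than the 4 GB system mentioned above, while the sparse form fits comfortably, so it is safer to keep the matrix sparse and avoid calls that densify it.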

The code was contributed equally by 1. me 2. Shubham Patel (https://github.com/lifeisshubh)
