This README describes the dependencies, installation steps, and how to get started with the code.

The code is written in Python 2.7+ and Java. It depends on:
- xLiMe Meta Data Model
- Numpy
- Scipy
- sklearn
- gensim
- LangID
- [Rake](https://github.com/aneesha/RAKE)
- [Pymongo-2.8](https://pypi.python.org/pypi/pymongo/2.8)
- [Kafka Python Client](https://github.com/dpkp/kafka-python)
- MongoDB
- Apache Kafka
- Clone the repository.
$git clone https://github.com/adityamogadala/xLiMeSemanticIntegrator.git
- Make sure you have python-dev and setuptools. If they are missing, install them.
$sudo apt-get install python-dev
$sudo pip install --upgrade setuptools
- BLAS/LAPACK are required for scipy and numpy. If they are not already present, install them.
$sudo apt-get install libblas-dev liblapack-dev
$sudo pip install -r requirements.txt
$sudo pip install kafka-python
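Before proceeding, you can quickly check that the Python dependencies import cleanly. A minimal sketch; the module names below are the usual import names of the packages listed above (e.g. `langid` for LangID, `kafka` for the Kafka Python client), which is an assumption about your installation:

```python
import importlib

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# Assumed import names for the dependencies listed above.
DEPS = ["numpy", "scipy", "sklearn", "gensim", "langid", "pymongo", "kafka"]

if __name__ == "__main__":
    gaps = missing_modules(DEPS)
    if gaps:
        print("Missing: " + ", ".join(gaps))
    else:
        print("All dependencies found.")
```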
- Download the word-embedding (monolingual and bilingual) zip files. Extract them and place them in StoreWordVec/wiki for Wikipedia, StoreWordVec/news for news, etc.
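A small helper for resolving the directory that holds the extracted vectors for a given source, following the StoreWordVec/&lt;source&gt; layout described above (the exact mapping of source names to subdirectories is an assumption):

```python
import os

# Assumed mapping from data source to its subdirectory under StoreWordVec/.
VEC_DIRS = {"wikipedia": "wiki", "news": "news"}

def vector_dir(source, root="StoreWordVec"):
    """Return the path to the extracted word vectors for a source."""
    try:
        return os.path.join(root, VEC_DIRS[source])
    except KeyError:
        raise ValueError("Unknown source: %s" % source)
```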
- Get MongoDB and create its data directory (under your $HOME directory) for the MongoDB database.
$sudo mkdir -p data/db/
- Start the MongoDB daemon with authentication enabled and create an admin user for all databases.
$sudo mongod --fork --logpath mongodb.log --auth
$mongo
> use admin
> db.createUser({user:"username",pwd:"password",roles: [{role:"userAdminAnyDatabase",db: "admin"}]})
(Create a superuser and password for the "admin" database)
> exit
$mongo -u username -p password --authenticationDatabase admin
> use MyStore
(Create your own database, which will be used in the config file)
> db.createUser({user:"username",pwd:"password",roles: [{role:"dbOwner",db: "MyStore"}]})
(Create a username and password for the database)
> exit
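The same credentials can then be used from Python via Pymongo. A minimal sketch, assuming MongoDB runs on localhost:27017 and the user/database names created above; the `build_uri` helper is hypothetical:

```python
def build_uri(user, password, db, host="localhost", port=27017, auth_db="admin"):
    """Build a MongoDB connection URI that authenticates against auth_db."""
    return "mongodb://%s:%s@%s:%d/%s?authSource=%s" % (
        user, password, host, port, db, auth_db)

if __name__ == "__main__":
    from pymongo import MongoClient  # Pymongo-2.8, listed above
    client = MongoClient(build_uri("username", "password", "MyStore"))
    print(client["MyStore"].collection_names())  # pymongo 2.x API
```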
- Update config/Config.conf as suggested in the file.
$python setup.py
- Start service/collector.sh to collect data from the Kafka stream.
$ nohup sh collector.sh &
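collector.sh wraps the collection logic; a stripped-down kafka-python consumer that decodes incoming xLiMe messages as JSON might look like the sketch below. The topic name and broker address are assumptions, and `parse_message` is a hypothetical helper:

```python
import json

def parse_message(raw):
    """Decode one Kafka message value (UTF-8 JSON) into a dict."""
    return json.loads(raw.decode("utf-8"))

if __name__ == "__main__":
    from kafka import KafkaConsumer  # kafka-python, listed above
    consumer = KafkaConsumer("xlime-topic", bootstrap_servers="localhost:9092")
    for msg in consumer:
        doc = parse_message(msg.value)
        print(doc)
```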
- Test if your MongoDB database collections exist and create text indexes for them.
$mongo -u username -p password --authenticationDatabase admin
> use MyStore
> db.auth("username","password")
("MyStore" user authentication)
> show collections
> db.getCollection('YOUR_COLLECTION_NAME').ensureIndex( {Text: "text", Title: "text"}, {dropDups: true} )
> exit
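The text index can also be created from Python, which is convenient when scripting the setup. A sketch using the pymongo 2.x API; the collection name and the Text/Title fields follow the mongo-shell command above:

```python
# Index specification matching the mongo-shell command above:
# a compound text index over the Text and Title fields.
TEXT_INDEX = [("Text", "text"), ("Title", "text")]

def create_text_index(collection):
    """Create the text index on a pymongo collection."""
    return collection.ensure_index(TEXT_INDEX)  # pymongo 2.x method name

if __name__ == "__main__":
    from pymongo import MongoClient
    uri = "mongodb://username:password@localhost:27017/?authSource=admin"
    db = MongoClient(uri)["MyStore"]
    create_text_index(db["YOUR_COLLECTION_NAME"])
```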
- Start service/vecgenerator.sh to generate vectors for subtitle and speech-to-text data (support for news and social media is yet to be added).
$ nohup sh vecgenerator.sh &
- The Examples folder contains a few examples of how to use the different classes for tasks such as simple search, advanced search, monolingual and cross-lingual document similarity, and analytics. You can run the individual Python files or the IPython notebooks (.ipynb).
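Once the text indexes exist, a simple search can be issued against them with MongoDB's $text operator. A minimal sketch; the `text_search_query` helper is hypothetical, and the simple-search example in the Examples folder may be implemented differently:

```python
def text_search_query(terms):
    """Build a MongoDB $text query string for the given search terms."""
    return {"$text": {"$search": terms}}

if __name__ == "__main__":
    from pymongo import MongoClient
    uri = "mongodb://username:password@localhost:27017/?authSource=admin"
    db = MongoClient(uri)["MyStore"]
    for doc in db["YOUR_COLLECTION_NAME"].find(text_search_query("climate")):
        print(doc.get("Title"))
```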