GunioRobot/SiLCC

 
 

SiLCC

===========
Synopsis
===========
SiLCC aims to provide a learning, NLP-based relevancy filter/tagger/classifier that is
applied to the various information sources that Swift aggregates (RSS, Twitter, SMS).

============
Installation
============

To set up a silcc server instance:

1) Create a Python virtualenv

2) In the root directory of this distribution, run:

$ python setup.py develop

This should install all needed libraries in your virtual environment.

3) You also need to install nltk. Download it from the following location:

$ wget http://nltk.googlecode.com/files/nltk-2.0b8.zip
$ unzip nltk-2.0b8.zip
$ cd nltk-2.0b8

Now make sure you have activated the virtualenv created in step 1.

Once you are sure it is activated, issue the following to install nltk into
your virtualenv:

$ python setup.py install
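
Before running the install command above, it can help to confirm from Python
itself that the virtualenv is active. The following is a minimal sketch and is
not part of SiLCC; the helper name in_virtualenv is hypothetical. It assumes
either a classic virtualenv (which sets sys.real_prefix) or a newer
virtualenv/venv (where sys.prefix differs from sys.base_prefix).

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter appears to live in a virtualenv."""
    # Classic virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv and the stdlib venv module leave sys.base_prefix pointing
    # at the system Python while sys.prefix points at the environment.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


print("virtualenv active:", in_virtualenv())
```

If this prints False, activate the environment from step 1 before installing.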

4) After installing nltk you need to download the Treebank
data that the part-of-speech tagger needs in order to work:

$ python
>>> import nltk
>>> nltk.download()

This will take you into an interactive download environment.
The corpus you want to download is: maxent_treebank_pos_tagger
After downloading it, quit Python.
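
As an alternative to the interactive downloader, the same package can usually
be fetched by passing its identifier to nltk.download(). This is a hedged
sketch: the helper name download_tagger is hypothetical, and it assumes your
nltk version accepts a package identifier as the first argument.

```python
# Identifier of the package named in step 4.
TAGGER_PACKAGE = "maxent_treebank_pos_tagger"


def download_tagger(package=TAGGER_PACKAGE):
    """Download the tagger data non-interactively and return nltk's result."""
    # Imported lazily so that defining this helper does not itself require nltk.
    import nltk

    return nltk.download(package)


# Run this from the activated virtualenv (requires network access):
# download_tagger()
```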

5) Create the database. If using MySQL:

$ mysql -u root -p

Once in the mysql environment:

mysql> create database silcc default charset utf8;
mysql> grant all on silcc.* to silcc@localhost identified by 'password';

Make sure the .ini file contains a DB URI to the above database.

(Please see instructions below on using SQLAlchemy migrate
scripts to generate the db schema)
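
To double-check the DB URI in your .ini file, a small script along the
following lines can help. This is only a sketch under assumptions: the section
name app:main and the key sqlalchemy.url follow the common Paste/Pylons
convention and may differ in your .ini file, and the helper name read_db_uri
and the sample text are hypothetical.

```python
from configparser import ConfigParser
from urllib.parse import urlsplit

# Hypothetical .ini fragment; the section and key names are assumptions.
SAMPLE_INI = """
[app:main]
sqlalchemy.url = mysql://silcc:password@localhost:3306/silcc
"""


def read_db_uri(ini_text, section="app:main", key="sqlalchemy.url"):
    """Return the database URI stored under the given section and key."""
    # Interpolation is disabled so '%' characters in passwords are left alone.
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, key)


uri = urlsplit(read_db_uri(SAMPLE_INI))
# The user and database name should match what was created in step 5.
print(uri.scheme, uri.username, uri.hostname, uri.port, uri.path.lstrip("/"))
```

To check a real file, read its contents and pass them to read_db_uri() in
place of SAMPLE_INI.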

6) To start the server:

$ paster serve --daemon prod.ini 

Check paster.log for any errors.

====================================================
Placing your SiLCC Instance DB under version control
====================================================
You may want to place your database under version control so 
that you can easily upgrade the schema, should it evolve over time.

To do this you need to first create an
empty database, and then place it under version control.

1) First install sqlalchemy-migrate:

$ easy_install sqlalchemy-migrate

2) Now place your database under version control (substitute your own
credentials in the database URL):

$ python db_repository/manage.py version_control mysql://silcc:silcc@localhost:3306/silcc

This will create the migrate_version table in your database and set the
initial version to 0.

After that you may run any schema upgrade scripts including the
first one which will create the necessary tables.

Most data in the SiLCC distribution can be recreated using the scripts and
data dumps provided.

3) To make upgrading your local instance easier, it's best to create a manage.py
script as follows:

$ migrate manage manage.py --repository=db_repository --url=mysql://silcc:silcc@localhost:3306/silcc

Since the manage.py will be specific to your local instance it is not included in the SiLCC repo.

4) Now run a db upgrade to generate the initial schema:

$ python manage.py upgrade

5) The Web Service API needs to have one or more valid keys in the apikey table.
For testing purposes, add a key with a valid_domains value of '*' (valid from all domains).
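
One way to add such a test key is sketched below. It runs against an in-memory
SQLite stand-in table rather than the real MySQL schema. Only the table name
apikey and the column valid_domains come from the text above; the keystr
column, the id column and the helper name add_test_key are hypothetical, so
check the schema generated by the migrate scripts before adapting it.

```python
import secrets
import sqlite3


def add_test_key(conn, valid_domains="*"):
    """Insert a random API key that is valid for the given domains."""
    key = secrets.token_hex(16)  # 32 hexadecimal characters
    # 'keystr' is a hypothetical column name used for illustration only.
    conn.execute(
        "INSERT INTO apikey (keystr, valid_domains) VALUES (?, ?)",
        (key, valid_domains),
    )
    conn.commit()
    return key


# Stand-in table so the sketch is runnable without a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE apikey ("
    "id INTEGER PRIMARY KEY, keystr TEXT UNIQUE, valid_domains TEXT)"
)
key = add_test_key(conn)
rows = conn.execute("SELECT keystr, valid_domains FROM apikey").fetchall()
print(rows)
```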

About

RESTful semantic tag extraction application for SMS, Twitter and Text
