========================================== SemanticTextDB - When NLP meets databases.

A database for document-storage/retrieval with automated curation and structure discovery, so that documents may be efficiently organized and queried not only based on human-labeled attributes/metadata, but also using a variety of optional automatically-inferred latent features including: semantics, topics, sentiment, eloquence, and entities of interest.

Inference of these properties is done using various statistical models and NLP algorithms stored and run inside the database.

========================================== What makes SemanticTextDB so cool?

We support augmented PostgreSQL SELECT statements via the semanticSelect() API. This method gives you the power of cutting-edge NLP algorithms with no additional coding. It's as easy as:

semanticSelect(table_name, postgreSQL_SELECT_statement, NLP_feature, feature_param)

For example, we can find President Obama's approval rating given a twitter table as follows:

statement = "SELECT COUNT(*) FROM tweets WHERE content LIKE '%Barack Obama%' AND tweets.country = 'US'"

posCount = semanticSelect('twitter_text', statement, 'positive_only', 0.8)

negCount = semanticSelect('twitter_text', statement, 'negative_only', -0.8)

approval_rating = posCount / negCount #assumes negCount != 0.
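
The ratio above fails when no negative tweets match. A minimal sketch of a guarded alternative, assuming posCount and negCount are plain integer counts (the return type of semanticSelect() is not stated here). The helper name `approval_share` is hypothetical and not part of SemanticTextDB; it reports positives as a share of all strongly opinionated tweets rather than a positive-to-negative ratio:

```python
def approval_share(pos_count, neg_count):
    """Return the positive share in [0, 1], or None if no tweets matched.

    Hypothetical helper for illustration; not part of SemanticTextDB.
    """
    total = pos_count + neg_count
    if total == 0:
        # No strongly positive or negative tweets: the share is undefined.
        return None
    return pos_count / total
```

With this form, `approval_share(posCount, negCount)` stays defined when negCount is 0, and the caller decides how to treat the `None` case.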

========================================== Use Cases: The power of SemanticTextDB

SELECT documents by topic. (e.g. lawyers can search for laws pertaining to the topic "transportation safety.")
SELECT documents with a summary view. A short document summary allows viewing of the document query results in a concise form.
Discover population trends with sentiment analysis. (e.g. determining approval of candidates in upcoming elections)
Educational purposes: spelling correction and grading of student homework documents added to the database.
Future NLP use cases. word_counts, word_frequencies, etc. can be selected for each document.
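
To make the last item concrete, here is a rough sketch of what per-document word counts and word frequencies could look like. It uses only the Python standard library and a deliberately simple tokenizer; the function names mirror the feature names above but are illustrative assumptions, not the way SemanticTextDB computes these features:

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens (letters and apostrophes only)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def word_frequencies(text):
    """Return each word's share of all tokens; empty dict for empty text."""
    counts = word_counts(text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}
```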

Installation

You will need Python 3.2 and pip installed.

See the next two sections for server and client installation.

Server (where postgresql database is running) Installation

The postgresql server requires:

Python 3 installed
PL/Python installed. This is installed as follows in postgresql on the server:

CREATE OR REPLACE LANGUAGE plpython3u;

You can also run this from the client in Python using the psycopg2 library:

cur = conn.cursor()  # conn is the psycopg2.connect() connection to the database
cur.execute("CREATE OR REPLACE LANGUAGE plpython3u;")
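
A slightly fuller sketch of the client-side route, assuming any DB-API connection such as the one returned by psycopg2.connect(). The helper name `enable_plpython` and the connection parameters in the usage comment are placeholders for illustration. Note that the PostgreSQL language name is `plpython3u`:

```python
CREATE_PLPYTHON = "CREATE OR REPLACE LANGUAGE plpython3u;"


def enable_plpython(conn):
    """Register PL/Python on the server through an open DB-API connection.

    Hypothetical helper for illustration; not part of SemanticTextDB.
    """
    cur = conn.cursor()
    try:
        cur.execute(CREATE_PLPYTHON)
        # psycopg2 opens a transaction implicitly, so commit to persist.
        conn.commit()
    finally:
        cur.close()


# Usage (connection parameters are placeholders):
# import psycopg2
# conn = psycopg2.connect(dbname="mydb", user="postgres")
# enable_plpython(conn)
```

The database role running this statement typically needs sufficient privileges to create an untrusted language.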

Other library dependencies:

numpy

$ [sudo] pip install numpy
scipy

$ [sudo] pip install scipy
psycopg2

$ [sudo] pip install psycopg2

Client Installation - clients use the SemanticTextDB library built on psycopg2 python interface driver.

Clients using SemanticTextDB require:

psycopg2

$ [sudo] pip install psycopg2
NLTK - download its data from within a Python terminal. A GUI will pop up. Click Download.

import nltk

nltk.download()
textblob (and its dependencies)

$ [sudo] pip install -U textblob
sumy (and its dependencies)

$ [sudo] pip install sumy
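
After installing, a quick check that the client packages listed above are importable can save debugging time later. This is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the check only confirms that each package imports, not that its version is compatible:

```python
import importlib

# Client packages named in this section.
CLIENT_DEPENDENCIES = ["psycopg2", "nltk", "textblob", "sumy"]


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(CLIENT_DEPENDENCIES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All client dependencies are importable.")
```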

Using SemanticTextDB

Simply clone the repo and refer to SemanticTextDB_Tutorial.py for documentation.

We STRONGLY recommend viewing the tutorial in IPython Notebook: open SemanticTextDB_Tutorial.ipynb instead of SemanticTextDB_Tutorial.py. The experience is much better.


jwmueller/SemanticTextDB
