WordSeer is a tool for natural language analysis of digital corpora.
There are two parts to this repository:

- A rewrite of the original PHP implementation of WordSeer in Python. This is located in `app/wordseer/`. It is the server-side and web interface code for the WordSeer application, written in Python using the Flask framework and several web framework libraries.
- A reimplementation of wordseerbackend in more maintainable Python. This is located in `app/preprocessor/`. It is the preprocessing pipeline for uploaded data sets.
The following packages must be installed before performing any setup:
Run `install.py` like so:

```shell
./install.py -i
```

This launches the interactive installer, which will guide you through the installation process. If you know what you want, run `install.py -h` to view the available console flags.
We also recommend installing the Python dependencies (discussed below) in a virtual environment:

```shell
pip install virtualenv
virtualenv venv
source venv/bin/activate
```
- Run:

  ```shell
  pip install -r requirements_win.txt
  ```

  to install the necessary packages.
- Run:

  ```shell
  python database.py create
  ```

  to create the database, and

  ```shell
  python database.py migrate
  ```

  to migrate the model schema into the database.
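The create-then-migrate pattern above can be sketched with Python's stdlib `sqlite3` module. This is an illustration of the idea only, not what WordSeer's `database.py` actually contains:

```python
# Illustrative sketch: WordSeer's database.py wraps its schema behind Flask,
# but create/migrate boils down to steps like these.
import sqlite3

conn = sqlite3.connect(":memory:")  # the real app persists to a file on disk

# "create": build the initial schema
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT)")

# "migrate": evolve the schema in place without losing existing rows
conn.execute("ALTER TABLE document ADD COLUMN source TEXT")

columns = [row[1] for row in conn.execute("PRAGMA table_info(document)")]
print(columns)  # ['id', 'title', 'source']
```

Running `migrate` after changing the models is what keeps an existing database in sync with the code.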
- `corenlp` must be installed manually. Clone the repository and install it:

  ```shell
  git clone https://github.com/silverasm/stanford-corenlp-python.git
  cd stanford-corenlp-python
  python setup.py install
  ```

  This should install `corenlp` to your system.
- To complete the setup, version 3.2.0 of Stanford's CoreNLP library must be in a directory accessible to the backend. Download the archive, move it to the root of the repository, extract it, and rename the folder from `stanford-corenlp-full-2013-06-20` to `stanford-corenlp`.
- If you followed the above directions, you shouldn't need to worry about any configuration. If you installed Stanford's CoreNLP elsewhere, edit `lib/wordseerbackend/wordseerbackend/config.py` for your setup. In particular, make sure `CORE_NLP_DIR` points to the Stanford NLP library.
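As an illustration, the relevant setting in `config.py` might look like the sketch below. Only the `CORE_NLP_DIR` name comes from the instructions above; the surrounding layout is an assumption:

```python
# Hypothetical excerpt of lib/wordseerbackend/wordseerbackend/config.py.
# Only CORE_NLP_DIR is named in the setup instructions; the use of os.path
# here is an assumption for illustration.
import os

# Point this at wherever the extracted Stanford CoreNLP folder lives.
CORE_NLP_DIR = os.path.abspath("stanford-corenlp")
```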
- Run the following command in the console:

  ```shell
  python -m nltk.downloader punkt
  ```

  You should then be ready to parse files. Example XML and JSON files are included in `tests/data`.
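Before running the pipeline on your own data, it can help to peek at a document's structure with the stdlib XML parser. The element names below are invented for illustration; consult the sample files in `tests/data` for the formats WordSeer actually accepts:

```python
# Hypothetical example document; the real schema is defined by the sample
# files in tests/data, not by this sketch.
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<document>"
    "<sentence>WordSeer analyzes digital corpora.</sentence>"
    "<sentence>Each sentence is processed by the pipeline.</sentence>"
    "</document>"
)
sentences = [s.text for s in doc.findall("sentence")]
print(len(sentences))  # 2
```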
Documentation is available on readthedocs. You can also build it yourself:

```shell
cd docs/
make html
```

Or, on Windows, simply run `make.bat` in the same directory.
Simply run `runtests.py`:

```shell
python runtests.py
```