Exam Paper similarity analyzer

The project is written in Python and requires Python 2.7. It uses PostgreSQL as its database backend, so PostgreSQL 9.4.2 or higher must also be installed.

Dependencies

This project has the following dependencies:

$ pip install flask pyparsing sqlalchemy pycurl slate sklearn nltk pandas numpy scipy psycopg2

For the scraper:

$ pip install beautifulsoup
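
To confirm that the main dependencies installed correctly, a quick import check can help. The following is a minimal sketch, not part of the project: the helper name `find_missing` and the `REQUIRED_MODULES` list are assumptions made for illustration, and the list covers only the main pip command (not the scraper's dependency).

```python
import importlib

# Import names for the packages in the main pip command.
# This list is an assumption for illustration; adjust it to your setup.
REQUIRED_MODULES = [
    "flask", "pyparsing", "sqlalchemy", "pycurl", "slate",
    "sklearn", "nltk", "pandas", "numpy", "scipy", "psycopg2",
]


def find_missing(modules):
    """Return the names in `modules` that raise ImportError on import."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `find_missing(REQUIRED_MODULES)` returns an empty list when every module imports; any names it returns point to packages that still need installing or fixing.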

Configuration

Rename src/config.py.sample to src/config.py and update the variables inside accordingly. PAPER_DIR is a directory to download the exam papers to.
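
As a rough illustration of what the edited file might contain, here is a sketch. Only PAPER_DIR is named in this README; the path, the `DATABASE_URI` setting, and the `ensure_paper_dir` helper are hypothetical, so check src/config.py.sample for the real variable names.

```python
import os

# PAPER_DIR is named in the README; the path chosen here is only an example.
PAPER_DIR = os.path.join(os.path.expanduser("~"), "exam-papers")

# Hypothetical setting, not confirmed by the README: a SQLAlchemy-style
# connection URL for the PostgreSQL database.
DATABASE_URI = "postgresql://user:password@localhost:5432/exam_papers"


def ensure_paper_dir(path=PAPER_DIR):
    """Create the download directory if it does not exist and return it."""
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```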

Database

Import the database dump in data/dumps/exam_papers-29112015.sql using the psql tool:

$ psql -f data/dumps/exam_papers-29112015.sql
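
If you prefer to script the import, the same command can be driven from Python. This is a sketch under assumptions: the function names are hypothetical, and the optional `dbname` argument (passed with psql's `-d` flag) is an addition, since the command above relies on psql's default database.

```python
import subprocess


def build_import_command(dump_path, dbname=None):
    """Build the psql argument list for importing a SQL dump."""
    cmd = ["psql"]
    if dbname:
        # -d selects the target database; omitted to use psql's default.
        cmd += ["-d", dbname]
    cmd += ["-f", dump_path]
    return cmd


def import_dump(dump_path, dbname=None):
    """Run psql and return its exit code (0 means psql reported success)."""
    return subprocess.call(build_import_command(dump_path, dbname))
```

For example, `import_dump("data/dumps/exam_papers-29112015.sql")` runs the same command as shown above.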

Troubleshooting

pycurl requires libcurl to be installed.
- On Debian: sudo apt-get install libcurl4-openssl-dev
pandas requires python-dev.
nltk requires running nltk.download() to download its data files. At the downloader prompt, type d and download the stopwords dataset.
slate is broken with the latest version of its dependency, PDFMiner. Fix it by running sudo pip install --upgrade --ignore-installed slate==0.3 pdfminer==20110515.
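
The nltk step can also be done without the interactive prompt by passing the dataset name directly to `nltk.download`. A minimal sketch, assuming nltk is already installed and the machine has network access; the wrapper name `download_stopwords` is hypothetical.

```python
def download_stopwords():
    """Fetch the NLTK stopwords corpus without the interactive downloader."""
    # Imported inside the function so this file can be loaded before
    # nltk has been installed.
    import nltk

    # nltk.download returns True when the dataset is available afterwards.
    return nltk.download("stopwords")
```

Once the download succeeds, `from nltk.corpus import stopwords` followed by `stopwords.words("english")` should return the English stopword list.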

Running

Once all the dependencies are installed and the database is running, it's simply a matter of starting the server and visiting http://localhost:5000/.

$ pwd
/downloads/ct422-project
$ cd ..
$ python -m project.src.web.api
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Note: To start the server, you must run the python command from the parent directory of the repository.
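
To check from a script that the server has come up, one option is to test whether anything is listening on port 5000. This is a sketch only: the helper name `port_is_open` is hypothetical, and it confirms that the port accepts TCP connections, not that the application itself is healthy.

```python
import socket


def port_is_open(host="127.0.0.1", port=5000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timed out.
        return False
    conn.close()
    return True
```

With the server running as shown above, `port_is_open()` should return True.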
