Internship project: pipeline between Stanford CoreNLP and Semafor

1 May to 20 July - Guilherme RAZET, LORIA, Nancy

Setup

1. Clone the repository (link);
2. Open a terminal in CoreNLP-Semafor-Pipeline;
3. Run this command: ./bin/install.sh [language], with [language] one of [a=arabic] [c=chinese] [e=english] [f=french] [g=german] [s=spanish];
4. Check the environment variables in bin/config.sh.
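
As a minimal sketch of the install step, the snippet below validates the language argument before the installer is called. It assumes that [language] is passed as the single letter from the list above (for example e for English). The helper name language_name is hypothetical and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): map the one-letter
# language code listed above to its language name.
# Assumption: install.sh expects the letter itself, e.g. "e" for English.
language_name() {
    case "$1" in
        a) echo "arabic" ;;
        c) echo "chinese" ;;
        e) echo "english" ;;
        f) echo "french" ;;
        g) echo "german" ;;
        s) echo "spanish" ;;
        *) echo "unknown language code: $1" >&2; return 1 ;;
    esac
}
```

Under that assumption, language_name e && ./bin/install.sh e would start the English install only when the code is recognised.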

Automatic report generation works only if a LaTeX distribution (such as TeX Live) is installed on your computer.
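
One way to check this requirement before running the pipeline is sketched below. It assumes the report is compiled with latexmk or pdflatex, which this README does not confirm, so adjust the command names to your setup. The helper have_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: the report is compiled with latexmk or pdflatex;
# change these names if your installation uses another compiler.
if have_cmd latexmk || have_cmd pdflatex; then
    echo "LaTeX compiler found: report generation should work."
else
    echo "No LaTeX compiler found: report generation is not expected to work."
fi
```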

The pipeline is ready!

1. Place your document in the data folder (it must be a plain text file);
2. Open a terminal in CoreNLP-Semafor-Pipeline;
3. Run this command: ./bin/runSemafor.sh [input] [output] [number of threads] [language], with:
   1. [input] = name of your document (example: test.txt);
   2. [output] = path and name of your output (example: data/test.out.xml). CAUTION: this file must have the .xml extension and must not already exist;
   3. [number of threads] = number of threads used by the process, usually at least 2;
   4. [language] = language of the document, one of [a=arabic] [c=chinese] [e=english] [f=french] [g=german] [s=spanish]. CAUTION: the language package must be downloaded before the process, usually during installation.
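
The two constraints on [output] can be checked before launching the pipeline. The following is a minimal sketch, not part of the repository: check_output is a hypothetical helper that succeeds only when the path ends in .xml and does not exist yet.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the repository).
# Succeeds only if the output path ends in .xml and does not exist yet.
check_output() {
    case "$1" in
        *.xml) ;;
        *) echo "output must be an .xml file: $1" >&2; return 1 ;;
    esac
    if [ -e "$1" ]; then
        echo "output already exists: $1" >&2
        return 1
    fi
    return 0
}
```

With the example values above, and assuming the language is given as its one-letter code, a guarded call could look like: check_output data/test.out.xml && ./bin/runSemafor.sh test.txt data/test.out.xml 2 e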

This process needs 2 MiB of free RAM; otherwise it will not work.
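
A rough way to verify the memory requirement on Linux is sketched below. It assumes a system that exposes MemAvailable in /proc/meminfo, so it will not work as-is on macOS or Windows. Both function names are hypothetical, and the threshold is passed in KiB.

```shell
#!/bin/sh
# Print the available memory in KiB, read from a Linux-style meminfo file.
# Assumption: a Linux system exposing MemAvailable in /proc/meminfo.
# An optional first argument overrides the file path (useful for testing).
available_kib() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if at least $1 KiB are available.
# $2 is an optional meminfo file path, forwarded to available_kib.
has_enough_ram() {
    avail=$(available_kib "$2")
    [ -n "$avail" ] && [ "$avail" -ge "$1" ]
}
```

For the figure stated above, has_enough_ram 2048 checks for 2 MiB (2048 KiB) of available memory.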

Documentation of Stanford CoreNLP: click here.

Documentation of Semafor: click here.

Documentation of PyLaTeX: click here.
