GitHub - Danfoa/SemEval-2012-task6-project: This repository shows an approach to address the SemEval 2012 task 6, Semantic Textual Similarity

This repository presents an approach to Task 6 of the SemEval-2012 competition, Semantic Textual Similarity. Please note that I did not participate in the original competition; this project was developed as an academic assignment.

The implementation details are documented in a Jupyter Notebook.

The features used in the final models are shown below in a correlation matrix.
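As a rough illustration of how such a matrix can be computed, here is a minimal pure-Python sketch of pairwise Pearson correlation. The feature names (`jaccard`, `length_diff`) and all values are hypothetical and are not taken from the notebook; only the 0 to 5 range of the gold score follows the task definition.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; report 0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: r}} for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical feature columns plus the gold similarity score (0 to 5).
features = {
    "jaccard": [0.1, 0.4, 0.5, 0.9],
    "length_diff": [9.0, 6.0, 5.0, 1.0],
    "gold": [0.5, 2.0, 2.5, 4.5],
}
matrix = correlation_matrix(features)
```

Reading the `gold` row of `matrix` gives a quick view of which features track the target, and the off-diagonal entries between features hint at redundancy.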

To recompute the features, you need to install CoreNLP and run it as a server. You also need to download the 300-dimensional GloVe model of your choice (download it here) and reference it in the features.py file.
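For orientation, the sketch below shows one common way to read GloVe's plain-text format and turn it into a sentence-level similarity. It is not the code in features.py: the helper names (`load_glove`, `sentence_vector`, `cosine`) are made up for this example, and averaging word vectors is only an assumed way of combining them.

```python
from math import sqrt


def load_glove(lines):
    """Parse GloVe text-format lines ("word v1 v2 ... vn") into a dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors


def sentence_vector(tokens, vectors):
    """Average the vectors of the tokens that have an embedding."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None  # no token of the sentence is in the vocabulary
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Tiny 3-dimensional stand-in for a real GloVe file.
toy = ["cat 1.0 0.0 0.0", "dog 0.8 0.6 0.0", "car 0.0 0.0 1.0"]
emb = load_glove(toy)
```

With a real download, the same function can be fed an open file handle, for example `with open(path, encoding="utf-8") as f: emb = load_glove(f)`, where `path` is wherever you saved the vectors.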

The training dataset was obtained from this repository. The high accuracy of this implementation relies on the fact that this augmented training dataset combines the training sets of the same competition from 2012 to 2017.

The performance of the resulting models is shown in the figure below. Each model uses a subset of the relevant features, selected by hand tuning, recursive feature elimination, or both (more details in the notebook).
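The idea behind recursive feature elimination can be sketched as follows: fit a model, drop the feature with the smallest weight, and repeat until the desired number of features is left. This is a hand-rolled NumPy illustration using ordinary least squares on synthetic data; the notebook may use a different estimator or a library implementation, and the feature names `f0` to `f3` are placeholders.

```python
import numpy as np


def recursive_feature_elimination(X, y, names, n_keep):
    """Repeatedly drop the feature with the smallest absolute weight."""
    # Standardise columns so the weights are comparable across features.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = y - y.mean()
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        # Refit on the surviving columns only.
        weights, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        weakest = int(np.argmin(np.abs(weights)))
        del remaining[weakest]
    return [names[i] for i in remaining]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
# Synthetic target: only the first two columns carry signal.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.01 * rng.normal(size=200)
kept = recursive_feature_elimination(X, y, ["f0", "f1", "f2", "f3"], n_keep=2)
```

Refitting after every removal matters because the weights of correlated features shift once one of them is gone.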

(BoW): denotes models that use, as one of their features, the output of a regressor trained only on BoW tf-idf embeddings.
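A minimal sketch of this kind of stacked feature is shown below. Several choices here are assumptions made for illustration and are not confirmed by the repository: each sentence pair is encoded as the absolute difference of its two tf-idf vectors, the regressor is a closed-form ridge regression, and the two training pairs are invented.

```python
import numpy as np


def fit_tfidf(sentences):
    """Learn a vocabulary and smoothed idf weights from tokenised sentences."""
    vocab = {w: i for i, w in enumerate(sorted({w for s in sentences for w in s}))}
    df = np.zeros(len(vocab))
    for s in sentences:
        for w in set(s):
            df[vocab[w]] += 1
    idf = np.log((1 + len(sentences)) / (1 + df)) + 1
    return vocab, idf


def tfidf(tokens, vocab, idf):
    """L2-normalised tf-idf vector; out-of-vocabulary tokens are ignored."""
    vec = np.zeros(len(vocab))
    for w in tokens:
        if w in vocab:
            vec[vocab[w]] += 1
    vec *= idf
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


def pair_matrix(pairs, vocab, idf):
    # Assumption: a pair is encoded as |tfidf(a) - tfidf(b)| plus a bias column.
    rows = [np.abs(tfidf(a, vocab, idf) - tfidf(b, vocab, idf)) for a, b in pairs]
    return np.hstack([np.array(rows), np.ones((len(rows), 1))])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression (the bias column is regularised too)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)


# Invented training pairs with gold similarity scores on the 0 to 5 scale.
train_pairs = [
    (["a", "man", "plays", "guitar"], ["a", "man", "plays", "guitar"]),
    (["a", "man", "plays", "guitar"], ["the", "stock", "market", "fell"]),
]
train_gold = np.array([5.0, 0.0])

vocab, idf = fit_tfidf([s for pair in train_pairs for s in pair])
weights = fit_ridge(pair_matrix(train_pairs, vocab, idf), train_gold)

# The regressor's prediction becomes one extra column for the final model.
bow_feature = pair_matrix(train_pairs, vocab, idf) @ weights
```

In a realistic setup the BoW predictions fed to the final model would normally be computed on data the BoW regressor was not trained on (for example, out-of-fold predictions), so that the stacked feature does not leak the training labels.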

