MOSEI_UMONS

Citation

@inproceedings{delbrouck-etal-2020-transformer,
    title = "A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis",
    author = "Delbrouck, Jean-Benoit  and
      Tits, No{\'e}  and
      Brousmiche, Mathilde  and
      Dupont, St{\'e}phane",
    booktitle = "Second Grand-Challenge and Workshop on Multimodal Language (Challenge-HML)",
    month = jul,
    year = "2020",
    address = "Seattle, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.challengehml-1.1",
    doi = "10.18653/v1/2020.challengehml-1.1",
    pages = "1--7"
}

Model

The model Model_LA is the module used for the UMONS solution to the MOSEI dataset, using only linguistic and acoustic inputs.
Results can be replicated in the accompanying Google Colab notebook.

Environment

Create a Python 3.6 environment with:

torch              1.2.0    
torchvision        0.4.0   
numpy              1.18.1    
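For example, with conda (a minimal sketch; the environment name mosei is an arbitrary choice):

conda create -n mosei python=3.6
conda activate mosei
pip install torch==1.2.0 torchvision==0.4.0 numpy==1.18.1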

We use GloVe vectors from spaCy. These can be installed into your environment using the following commands:

wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
pip install en_vectors_web_lg-2.1.0.tar.gz
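To check that the vectors installed correctly, a quick sanity check (assuming a spaCy 2.1.x install, which this model targets; the GloVe vectors are 300-dimensional):

python -c "import spacy; print(spacy.load('en_vectors_web_lg')('hello').vector.shape)"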

Data

Download data from here.
Unzip the files into the 'data' folder, as shown in the sketch below.
More information about the data can be found in the 'data' folder.
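For example (a sketch; the archive name mosei_data.zip is a placeholder for whatever the downloaded file is called):

mkdir -p data
unzip mosei_data.zip -d data/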

Training

To train a Model_LA model on the emotion labels, use the following command:

python main.py --model Model_LA --name mymodel --task emotion --multi_head 4 --ff_size 1024 --hidden_size 512 --layer 4 --batch_size 32 --lr_base 0.0001 --dropout_r 0.1

Checkpoints are created in the folder ckpt/mymodel.

The --task argument can be set to emotion or sentiment. For binary sentiment training (positive vs. negative), add --task_binary True, as in the sketch below.
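For example, a binary-sentiment run reusing the hyperparameters from the emotion command above (the name mymodel_sent2 is arbitrary):

python main.py --model Model_LA --name mymodel_sent2 --task sentiment --task_binary True --multi_head 4 --ff_size 1024 --hidden_size 512 --layer 4 --batch_size 32 --lr_base 0.0001 --dropout_r 0.1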

Evaluation

You can evaluate a model by typing:

python ensembling.py --name mymodel

The task settings are stored in the checkpoint state dict, so evaluation is carried out on the dataset you trained your model on.

By default, the script globs all the training checkpoints inside the folder and performs ensembling over them.

Results:

Results were obtained on a single GeForce GTX 1080 Ti.
Training performance:

| Modality | Memory usage | GPU usage | sec / epoch | Parameters | Checkpoint size |
|---|---|---|---|---|---|
| Linguistic + acoustic | 320 MB | 2400 MiB | 103 | ~33 M | 397 MB |
| Linguistic + acoustic + vision | | | | | |

You should obtain approximately the following results:

| Task | Val accuracy | Test accuracy | Test ensemble accuracy | Epochs |
|---|---|---|---|---|
| Sentiment-7 | 43.61 | 43.90 | 45.36 | 6 |
| Sentiment-2 | 82.30 | 81.53 | 82.26 | 8 |
| Emotion-6 | 81.21 | 81.29 | 81.48 | 3 |

Ensemble results use at most 5 single models.
The 7-class and 2-class sentiment models and the emotion models were trained according to the instructions here.

Pre-trained checkpoints:

Result Sentiment-7 ensemble is obtained from these checkpoints : Download Link
Result Sentiment-2 ensemble is obtained from these checkpoints : Download Link
Result Emotion ensemble is obtained from these checkpoints : Download Link
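To reproduce an ensemble result from downloaded checkpoints, a sketch (archive and folder names are placeholders; per the Training and Evaluation sections above, ensembling.py expects the checkpoints under ckpt/<name>):

unzip sentiment7_checkpoints.zip -d ckpt/sentiment7
python ensembling.py --name sentiment7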
