The goal of this project is to create a multi-modal Speech Emotion Recognition system trained on the IEMOCAP dataset.
- Feb 2019 - IEMOCAP dataset acquisition and parsing
- Mar 2019 - Baseline of the linguistic model
- Apr 2019 - Baseline of the acoustic model
- May 2019 - Integration and optimization of both models
- Jun 2019 - Integration with an open-source ASR (most likely DeepSpeech)
IEMOCAP stands for the Interactive Emotional Dyadic Motion Capture database. It is the most popular database used for multi-modal speech emotion recognition.
The IEMOCAP database suffers from major class imbalance. To mitigate this, we reduce the number of classes to four and merge Excitement and Happiness into a single class.
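A minimal sketch of this label mapping, assuming the category abbreviations used in IEMOCAP's evaluation files (`neu`, `hap`, `exc`, `sad`, `ang`); the `map_label` helper is illustrative, not part of the actual codebase:

```python
# Hypothetical helper: collapse raw IEMOCAP categories into the 4 classes used here.
# 'exc' (Excitement) is merged into Happiness; remaining categories are discarded.
LABEL_MAP = {
    "neu": "Neutral",
    "hap": "Happiness",
    "exc": "Happiness",  # merged with Happiness to ease class imbalance
    "sad": "Sadness",
    "ang": "Anger",
}

def map_label(raw_label: str):
    """Return one of the 4 target classes, or None for discarded categories."""
    return LABEL_MAP.get(raw_label)
```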
Results for the single-modality models and their ensembles on the 4-class setup:

Model | Weighted Accuracy | Unweighted Accuracy | Loss |
---|---|---|---|
Acoustic | 0.602 | 0.601 | 0.983 |
Linguistic | 0.642 | 0.638 | 0.913 |
Ensemble (highest confidence) | 0.699 | 0.704 | 0.827 |
Ensemble (average) | 0.711 | 0.708 | 0.948 |
Ensemble (weighted average) | 0.716 | 0.712 | 0.944 |
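The three ensembling rules above can be sketched directly over the two models' softmax outputs. This is an illustrative reconstruction, not the project's actual code; each model is assumed to return an `(n_samples, n_classes)` probability matrix, and the ensemble weight shown is only a placeholder:

```python
import numpy as np

def ensemble_highest_confidence(p_acoustic, p_linguistic):
    """Per sample, keep the prediction of whichever model is more confident."""
    use_acoustic = p_acoustic.max(axis=1) >= p_linguistic.max(axis=1)
    return np.where(use_acoustic,
                    p_acoustic.argmax(axis=1),
                    p_linguistic.argmax(axis=1))

def ensemble_average(p_acoustic, p_linguistic):
    """Average the two probability distributions, then take the argmax."""
    return ((p_acoustic + p_linguistic) / 2).argmax(axis=1)

def ensemble_weighted_average(p_acoustic, p_linguistic, w_acoustic=0.4):
    """Weighted average of the distributions; the weight is a tunable
    hyperparameter (0.4 is a placeholder, not the experiment's value)."""
    return (w_acoustic * p_acoustic
            + (1 - w_acoustic) * p_linguistic).argmax(axis=1)
```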
Final metrics of the weighted-average ensemble: loss 0.944, weighted accuracy 0.716, unweighted accuracy 0.712. Confusion matrix (rows: true class, columns: predicted class):

True \ Predicted | Neutral | Happiness | Sadness | Anger |
---|---|---|---|---|
Neutral | 291 | 60 | 31 | 9 |
Happiness | 88 | 282 | 17 | 6 |
Sadness | 46 | 19 | 191 | 2 |
Anger | 61 | 26 | 4 | 167 |
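Both accuracy figures follow directly from this matrix: weighted accuracy is the overall fraction of correct predictions, and unweighted accuracy is the mean of per-class recalls. A small self-contained sanity check (assumes only numpy):

```python
import numpy as np

# Confusion matrix from above: rows are true classes, columns are predictions,
# in the order [Neutral, Happiness, Sadness, Anger].
conf_mat = np.array([
    [291,  60,  31,   9],
    [ 88, 282,  17,   6],
    [ 46,  19, 191,   2],
    [ 61,  26,   4, 167],
])

# Weighted accuracy: correct predictions over all samples.
weighted_acc = conf_mat.trace() / conf_mat.sum()

# Unweighted accuracy: mean of per-class recalls (diagonal over row sums).
unweighted_acc = (conf_mat.diagonal() / conf_mat.sum(axis=1)).mean()

print(f"weighted acc: {weighted_acc:.3f}")      # 0.716
print(f"unweighted acc: {unweighted_acc:.3f}")  # 0.712
```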