1. PyCharm version 2019.2.1
2. Python 3.6
3. 16 GB RAM
4. At least 4 GB of free memory
5. Windows 10
1. pip install -r requirements.txt
1. Run FinalGui.py
2. On the login screen, use the following credentials:
- user: tal
- password: 123
Image 1
1. Run FinalGui.py
2. Click Register and follow the program's instructions.
Image 1
Image 2
Image 3
Source:
The data combines manual recordings with recordings generated by a text-to-speech program in different accents.
Labels:
In total there are 41 labels: 40 are people's names, and the last is a noise label that helps prevent false predictions.
Description:
The samples are in WAV format with an average length of 1 second.
Each label has 13 samples of women's voices and 4 samples of men's voices.
Number of samples:
There are 680 original samples (17 for each of the 40 name labels). We expanded the data by adding 391 samples to each label, so each name label has 408 samples.
In addition, we added a noise label with 2000 samples.
The total number of samples is 18320.
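The totals above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative and are not taken from the project code.

```python
# Sample-count arithmetic for the name-recognition dataset.
# Variable names are illustrative; the figures come from the description above.
NAME_LABELS = 40
WOMEN_PER_LABEL = 13
MEN_PER_LABEL = 4
ADDED_PER_LABEL = 391
NOISE_SAMPLES = 2000

original_per_label = WOMEN_PER_LABEL + MEN_PER_LABEL        # originals per name label
original_total = NAME_LABELS * original_per_label           # all original samples
samples_per_label = original_per_label + ADDED_PER_LABEL    # after expanding the data
total_samples = NAME_LABELS * samples_per_label + NOISE_SAMPLES

print(original_total, samples_per_label, total_samples)
```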
Dataset Location:
Due to the large size of the data, we packed it into a zip file, which is located in this Google Drive link.
If you want to retrain the models, download the zip file and extract the folder into the project's main folder.
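If you prefer to script the extraction step, something like the following should work. It is a minimal sketch that uses only the Python standard library; the function name `extract_dataset` and the archive name in the usage line are hypothetical, so substitute the name of the file you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, project_root):
    """Extract the downloaded dataset archive into the project's main folder.

    Returns the sorted list of entries that were in the archive.
    """
    project_root = Path(project_root)
    project_root.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(project_root)
        return sorted(archive.namelist())
```

Usage (the archive name is a placeholder): `extract_dataset("dataset.zip", ".")` when run from the project's main folder.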
We use this dataset for training, testing, and validation. It is split randomly:
70% training.
25% testing.
5% validation.
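A random 70/25/5 split like the one above can be sketched as follows. This is an illustration under assumptions, not the project's actual splitting code: the function name `split_dataset`, its parameters, and the use of integer sample indices in place of real files are all hypothetical.

```python
import random


def split_dataset(items, train_pct=70, test_pct=25, seed=None):
    """Shuffle items and split them into (train, test, validation) lists.

    Percentages are integers so the counts are computed without
    floating-point rounding; validation receives whatever remains.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_test = len(shuffled) * test_pct // 100
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],
    )


# Stand-in for the 18320 samples: one integer index per sample.
train_set, test_set, val_set = split_dataset(range(18320), seed=0)
print(len(train_set), len(test_set), len(val_set))
```

Passing a fixed `seed` makes the split reproducible between runs; leaving it as `None` gives a different random split each time.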
Source:
The data is a collection of manual recordings made by 2 women and 2 men, plus mixed-gender recordings from the TIMIT database.
Labels:
In total there are 5 labels: 4 are people's names, and the last is an "unidentified" label for people who are not in the system.
Description:
The samples are in PNG format with an average size of 10.5 KB.
Number of samples:
Each label has 15 samples, for a total of 75 samples.
Dataset Location:
The data is located in the speaker folder.
The training and testing data is located in the data folder.
It is split randomly:
70% training.
30% testing.
1. speaker folder:
Contains all the audio files for the speaker recognition model.
2. img data:
Contains all the spectrograms of the audio files for the speaker model.
3. data:
Contains 3 folders:
- train:
contains all the images for training.
- val:
contains all the images for testing.
- test/temp:
contains the live-recording file that is analyzed in real time.
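Before running the program, you may want to confirm that these folders are in place. The sketch below is one possible check, not part of the project. It assumes the folder names are spelled on disk exactly as in the list above (including "img data") and that "test/temp" is a nested path; adjust `EXPECTED_DIRS` if your copy differs. The helper name `missing_dirs` is hypothetical.

```python
from pathlib import Path

# Folder names copied from the list above. These are assumptions about the
# on-disk layout: "img data" and the nested "data/test/temp" path may be
# spelled differently in the actual project.
EXPECTED_DIRS = [
    "speaker",
    "img data",
    "data/train",
    "data/val",
    "data/test/temp",
]


def missing_dirs(project_root):
    """Return the expected folders that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs(".")` from the project's main folder returns an empty list when every folder is present.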