thewolfe1/TheThirdEye


Runtime environment (RTE):

1. PyCharm 2019.2.1
2. Python 3.6
3. 16 GB RAM
4. At least 4 GB of free memory
5. Windows 10

Installation:

1. pip install -r requirements.txt

How to run:

Option 1 (without retraining the models):

1. Run FinalGui.py
2. In the login option, use the following:

Option 2 (with retraining the models):

This will take at least one hour to retrain both models:

1. Run FinalGui.py
2. Click on Register and follow the program instructions.

DATASET:

Word recognition data:

Source:
The data is a combination of recordings generated by a text-to-speech program in different accents and manual recordings.
Labels:
In total there are 41 labels, 40 of them are people's names and the last one is a noise label which helps us prevent false predictions.
Description:
The samples are in WAV format with an average length of 1 second.
There are 13 female samples and 4 male samples for each label.
Number of samples:
There are 680 original samples; for each label we expand the data by adding 391 samples, so each label has 408 samples.
In addition, we added a noise label with 2000 samples.
The total number of samples is 18320.
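The repository does not show how each label's 17 original clips are expanded to 408. A common approach is simple waveform augmentation; the sketch below (time shift plus added noise, with illustrative parameters, not the repository's actual code) shows one way it could be done:

```python
import numpy as np

def time_shift(wav: np.ndarray, shift: int) -> np.ndarray:
    """Circularly shift the waveform by `shift` samples."""
    return np.roll(wav, shift)

def add_noise(wav: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Mix in white noise at the given signal-to-noise ratio (dB)."""
    signal_power = np.mean(wav ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wav.shape)
    return wav + noise

def augment(wav: np.ndarray, n_copies: int, seed: int = 0) -> list:
    """Generate `n_copies` randomly perturbed variants of one recording."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_copies):
        # Shift by up to +/- 0.1 s at a 16 kHz sample rate (hypothetical values)
        shifted = time_shift(wav, int(rng.integers(-1600, 1600)))
        out.append(add_noise(shifted, snr_db=20.0, rng=rng))
    return out

# Example: expand one 1-second 440 Hz clip into 23 extra variants
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
variants = augment(clip, n_copies=23)
```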
Dataset Location:
Due to the large size of the data, we created a zip file; it is located at this Google Drive link.
If you want to retrain the models, download the zip and extract the folder into the main project folder.
We use this dataset for training, testing, and validation; it is distributed randomly:
70% training.
25% testing.
5% validation.
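The 70/25/5 split above can be sketched as a simple random partition (the ratios and the 18320-sample total are from this README; the helper itself is illustrative, not the repository's code):

```python
import random

def split_dataset(samples, train=0.70, test=0.25, val=0.05, seed=42):
    """Shuffle and partition samples into train/test/validation lists."""
    assert abs(train + test + val - 1.0) < 1e-9
    items = list(samples)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n_train = int(len(items) * train)
    n_test = int(len(items) * test)
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])

# 18320 total samples -> 12824 train / 4580 test / 916 validation
train_set, test_set, val_set = split_dataset(range(18320))
```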

Speaker recognition data:

Source:
The data is a collection of manual recordings made by 2 women and 2 men, plus the TIMIT database with mixed genders.
Labels:
In total there are 5 labels; 4 of them are people's names and the last one is an "unidentified" label for people who are not in the system.
Description: The samples are in PNG format with an average size of 10.5 KB.
Number of samples:
For each label there are 15 samples, in total there are 75 samples.
Dataset Location:
The data is located in the speaker folder.
The training and testing data is located in the data folder.
It is distributed randomly:
70% training.
30% testing.

Folder explanation:

1. speaker folder:
It contains all the audio files for the speaker recognition model.
2. img data:
Contains all the spectrograms of the speaker model's audio files.
3. data:
Contains 3 folders:
-train:
Contains all the images for training.
-val:
Contains all the images for testing.
-test/temp:
Contains the live-recording file that is analyzed in real time.
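The img data folder above holds spectrogram images of the speaker audio. The README does not show the conversion pipeline; a minimal NumPy-only sketch of turning a waveform into a log-magnitude spectrogram matrix (which could then be saved as a PNG) might look like this, with hypothetical FFT and hop sizes:

```python
import numpy as np

def log_spectrogram(wav: np.ndarray, n_fft: int = 512, hop: int = 128) -> np.ndarray:
    """Short-time FFT magnitude in dB; rows = frequency bins, cols = frames."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wav) - n_fft) // hop
    # Slice the waveform into overlapping windowed frames
    frames = np.stack([wav[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # shape: (n_frames, n_fft // 2 + 1)
    return 20 * np.log10(mag.T + 1e-10)         # dB scale, transposed to (freq, time)

# One second of a 440 Hz tone sampled at 16 kHz
wav = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
spec = log_spectrogram(wav)
```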

Project book:

link

Project demo video:

link
