Note: It is recommended to install Anaconda to manage separate environments and to avoid package/library version conflicts. Download: Download Anaconda
$ conda create -n env_name python=2.7
Note: Change env_name to a name of your choice
$ conda activate env_name
Note: After successful activation of the environment, the terminal prompt should display something similar to the following
(env_name)$
$ git clone https://github.com/wildlytech/modular_acoustic_detection.git
$ git submodule update --init --recursive
# Make script executable
$ chmod 777 download_data_files.sh
$ ./download_data_files.sh
To install all the required Python packages in one go, run the command below
$ pip install -r requirements.txt
Note: Approach 1 is the preferred method, as all package versions are pinned (frozen) automatically there.
# Make script executable
$ chmod 777 ubuntu_packages_install.sh
# Run script to install
$ ./ubuntu_packages_install.sh
- This process fetches the audio files (.wav format) from YouTube that have been labelled by Google
- More details about the dataset and its annotation can be found at the mentioned link: Google Audioset
- To get the wav files from the above-mentioned source, we have to enter the get_data/ directory
Follow the command to navigate into get_data/
$ cd get_data/
Note: Navigate to get_data/
- Audio augmentation is a process wherein we perform operations such as mixing multiple sounds, time-shifting the audio, scaling it, and changing its volume, so that it sounds different from the original while still remaining realistic
- To perform augmentation-related operations, navigate to augmentation/
- Follow the command in the terminal to navigate to the augmentation directory
$ cd augmentation/
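The operations listed above can be sketched with NumPy. This is an illustrative sketch only; the function names below are not the repo's augmentation API:

```python
import numpy as np

def mix(a, b, w=0.5):
    """Mix two equal-length signals, weighting the first by w."""
    return w * a + (1.0 - w) * b

def time_shift(x, shift):
    """Circularly shift the signal by `shift` samples."""
    return np.roll(x, shift)

def change_volume(x, gain):
    """Scale the amplitude by `gain`, clipping to [-1, 1]."""
    return np.clip(x * gain, -1.0, 1.0)

# Two toy one-second "clips" at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
noise = 0.1 * np.random.default_rng(0).standard_normal(sr)

# Chain the operations to produce one augmented clip
augmented = change_volume(time_shift(mix(tone, noise), sr // 10), 0.8)
```

Each operation keeps the clip length unchanged, so augmented clips can be fed to the same feature-extraction pipeline as the originals.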
This will generate the embeddings as .pkl files at the directory you specify. This script requires additional functional scripts found at Tensorflow-models-repo.
$ python generating_embeddings.py --wav_file --path_to_write_embeddings
- Embeddings are written as .pkl files, one per downloaded audio file, at the specified directory. (--wav_file requires the directory path where the .wav files are saved)
- This will add the generated embedding values of each audio file to the base dataframe if it already exists (TYPE 1); otherwise it creates a base dataframe with ["wav_file", "features"] columns, i.e. (TYPE 2). The final dataframe will have one extra column, ["features"], compared with downloaded_base_dataframe.pkl
- About Arguments:
- -dataframe_without_feature : If a TYPE 1 dataframe already exists, its path should be given; otherwise this can be ignored
- -path_for_saved_embeddings : Directory path where all the .pkl files are saved
- -path_to_write_dataframe : Path, along with the file name with a .pkl extension, at which to write the final dataframe
$ python create_base_dataframe.py [-h] -dataframe_without_feature -path_for_saved_embeddings -path_to_write_dataframe
- If a -dataframe_without_feature (TYPE 1) dataframe is given as input, the ["features"] column is added to that same dataframe; if not, a new dataframe with ["wav_file", "features"] columns is stored
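The logic above can be sketched with pandas. This is a minimal illustration of the idea, not the repo's create_base_dataframe.py; file names and the merge strategy are assumptions:

```python
import os
import pickle
import tempfile
import pandas as pd

def create_base_dataframe(path_for_saved_embeddings, dataframe_without_feature=None):
    """Collect per-file embeddings into a ["wav_file", "features"] dataframe,
    merging them into an existing TYPE 1 dataframe when one is given."""
    rows = []
    for fname in sorted(os.listdir(path_for_saved_embeddings)):
        if fname.endswith(".pkl"):
            with open(os.path.join(path_for_saved_embeddings, fname), "rb") as f:
                # assume each .pkl holds the embedding for one wav file
                rows.append({"wav_file": fname[:-4] + ".wav",
                             "features": pickle.load(f)})
    features_df = pd.DataFrame(rows, columns=["wav_file", "features"])
    if dataframe_without_feature is not None:
        # TYPE 1 given: attach the new ["features"] column by wav_file
        return dataframe_without_feature.merge(features_df, on="wav_file")
    return features_df  # TYPE 2: fresh ["wav_file", "features"] dataframe

# Demo: two dummy embedding files in a temporary directory
tmp = tempfile.mkdtemp()
for name in ("clip_a", "clip_b"):
    with open(os.path.join(tmp, name + ".pkl"), "wb") as f:
        pickle.dump([0.1, 0.2, 0.3], f)
base_df = create_base_dataframe(tmp)
```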
- This script reads a (TYPE 1) dataframe, i.e. one that has a ["labels_name"] column, separates out the sounds based on their labels, creates a different dataframe per label, and writes them at the given target path
- You can check the coarse_labels.csv file to see the mapping of the labels along which the separation of sounds takes place
- To separate sounds based on labels, navigate to data_preprocessing_cleaning/separating_different_sounds.py
- Follow the command below to navigate to the directory and execute the script to separate sounds
$ cd data_preprocessing_cleaning/
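The per-label split can be sketched in a few lines of pandas (illustrative only; the real script also writes each dataframe to the target path):

```python
import pandas as pd

def separate_by_label(df):
    """Split a TYPE 1 dataframe into one dataframe per label in "labels_name"."""
    # explode() handles clips tagged with more than one label
    exploded = df.explode("labels_name")
    return {label: group for label, group in exploded.groupby("labels_name")}

# Toy TYPE 1 dataframe; label names here are made up for the demo
df = pd.DataFrame({"wav_file": ["a.wav", "b.wav"],
                   "labels_name": [["dog_barking"], ["dog_barking", "vehicle"]]})
per_label = separate_by_label(df)
```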
- Once we have the required audioset consisting of different labelled audio clips (labels_name), each of 10 seconds, and their corresponding embeddings (features) in dataframe format (preferably), we can use these dataframe files for training ML / DL models
- Any dataframe file with these columns in it can be included in the training data, i.e. ["wav_file", "labels_name", "features"]
- We have to add the path of that required dataframe (with the above-mentioned columns) in balanced_data.py
- Once the path of the required dataframe is set in the above-mentioned script, navigate to models/ to train different types of ML and DL models
- To navigate, follow the command below
$ cd models/
Note: Navigate to models/
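Before pointing balanced_data.py at a dataframe, it can help to check that the file actually has the three columns listed above. A small sketch (the helper name is hypothetical, not part of the repo):

```python
import pandas as pd

# Columns the training scripts expect, per the section above
REQUIRED_COLUMNS = ["wav_file", "labels_name", "features"]

def is_trainable(df):
    """Return True when a dataframe has every required training column."""
    return all(col in df.columns for col in REQUIRED_COLUMNS)

good = pd.DataFrame(columns=REQUIRED_COLUMNS)
bad = pd.DataFrame(columns=["wav_file", "features"])  # missing labels_name
```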
- This folder consists of scripts used to predict on audio files using different types of trained ML/DL models
- After training the ML/DL models, we are able to save the trained model weights in a .h5 file for each model. To predict on any audio file, we use these model weight files to make predictions
- Navigate to the predictions/ folder to start predicting on single/multiple audio files using different models
- To navigate, follow the command below
$ cd predictions/
Note: Navigate to predictions/
- Definition of Audio Compression: Wikipedia Source
- Two different types of Audio compression
- Lossy Audio Compression
- Lossless Audio Compression
To perform various audio compression techniques, and to decompress the compressed audio files back, navigate to the compression/ directory. To navigate, follow the command below
$ cd compression/
Note: Navigate to compression/
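The defining property of lossless compression is that the original data is exactly recoverable. A minimal sketch of that property using Python's zlib on raw bytes (the repo's scripts use real audio codecs; this only illustrates the concept):

```python
import zlib

def compress_lossless(raw):
    """Lossless compression: the original bytes are exactly recoverable."""
    return zlib.compress(raw, 9)

def decompress(blob):
    return zlib.decompress(blob)

pcm = bytes(range(256)) * 100  # stand-in for raw PCM audio bytes
blob = compress_lossless(pcm)
```

Lossy compression, by contrast, discards perceptually less important detail, so the decompressed audio is close to, but not identical with, the original.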
- Definition of Goertzel Filter: Wikipedia Source
- Navigate to the goertzel_filter/ directory if you want to:
- Visualize the audio file as a spectrogram after applying the Goertzel filter
- Extract particular frequency components of an audio file
- To navigate to the goertzel_filter/ directory, follow the command below
$ cd goertzel_filter/
Note: Navigate to goertzel_filter/
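The Goertzel algorithm evaluates the power of a single DFT bin with a two-term recurrence, which is why it suits extracting one frequency component of an audio file. A standard textbook sketch (not the repo's implementation):

```python
import math

def goertzel_power(samples, sample_rate, target_freq):
    """Power of a single frequency component via the Goertzel algorithm."""
    n = len(samples)
    k = int(0.5 + n * target_freq / sample_rate)  # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2  # second-order recurrence
        s_prev2, s_prev = s_prev, s
    # squared magnitude of the k-th DFT bin
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

# A pure 440 Hz tone should show far more power at 440 Hz than at 1000 Hz
sr = 8000
tone = [math.sin(2 * math.pi * 440 * i / sr) for i in range(sr)]
```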
- About Dash Framework: Dash | Plotly
- We have used the Dash framework for building local web apps for the different purposes stated below
- Audio Annotation : We can annotate audio files (.wav format) in any folder present locally and save all the annotations in a .csv file. It also enables viewing the spectrogram and seeing the model's prediction for each wav file
- To annotate audio files, navigate to Dash_integration/annotation/
- Follow the command to navigate to that folder in the terminal
$ cd Dash_integration/annotation/
- Device Report : Enables generating a concise report for each device that uploads files to the FTP server. Device parameters such as battery performance, location details, etc. can be visualized using this app
- To generate a report, navigate to Dash_integration/device_report/
- Follow the command to navigate to this folder in the terminal
$ cd Dash_integration/device_report/
- Monitoring and Alert : Enables the user to monitor FTP server directories and device(s), get alerts on detection of any sounds of interest, upload multiple audio wav files to see their predictions, etc.
- To monitor and get alerts via SMS, navigate to Dash_integration/monitoring_alert/
- Follow the command in the terminal to navigate to this folder
$ cd Dash_integration/monitoring_alert/