Modular Acoustic Detection

1. Environment Setup

1.1 Ubuntu Environment Setup

Note: Recommended to install Anaconda to manage the different environment and to avoid package/library version conflicts Download: Download Anaconda

Create a separate environment with python 2.7 Version

$ conda create -n env_name python=2.7

Note: Change env_name to your convenient name

Activate the created environment

$ conda activate env_name

Note: After successful activation of the environment terminal should display something similar to above

(env_name)$

1.2 Local Repository setup

Clone the Repository

$ git clone https://github.com/wildlytech/modular_acoustic_detection.git

To load all git sub modules :

$ git submodule update --init --recursive

To download all the data files :

# Make script executable
$ chmod 777 download_data_files.sh

$ ./download_data_files.sh

1.3 Python environment Setup

To install all the required library python packages at one go. Type in the command mentioned below

Approach 1:

$ pip install -r requirements.txt

Note : Approach 1 is Preferred method as all the packages are freezed automatically here.

Approach 2:

# Make script executable
$ chmod 777 ubuntu_packages_install.sh

# Run script to install
$ ./ubuntu_packages_install.sh

2. Getting Audio Data

This process is to get the audio files (.wav format) from YouTube which are labelled by Google
We can more details about the data-set and its annotation on the mentioned link. Google Audioset
For getting wav files from the above mentioned source we have enter to the get_data/ directory.

Follow the command to navigate into get_data/

$ cd get_data/

Note: Navigate to get_data/

3. Audio Augmentation

Audio Augmentation is a process wherein we perform operations like mixing multiple sounds, Time shift the audio data, Scale the audio, change the volume of the audio etc to make it audible differently than the original sound but also at the same making sure it is realistic
To perform operations related augmentation navigate to augmentation/
Follow the command in terminal to navigate to augmentation directory
$ cd augmentation/

4. Generate Embeddings

This will download the embeddings as .pkl files at the directory where you specify. This script requires additional functional scripts found at Tensorflow-models-repo.

$ python generating_embeddings.py   --wav_file
				    --path_to_write_embeddings

Output of above script will return :

Embeddings in .pkl files for each downloaded audio file at specified directory. (--wav_file requires the directory path where .wav files are saved )

5. Create Base Dataframe

This will add the generated embedding values of each audio file to base dataframe columns if it already exists (TYPE 1), otherwise it creates base dataframe with ["wav_file", "features"] columns i.e (TYPE 2). Final dataframe will now have one extra column when compared with downloaded_base_dataframe.pkl i.e with ["features"]
About Arguments:
- -dataframe_without_feature : If TYPE 1 dataframe exists already, path of it should be given otherwise it can be ignored
- -path_for_saved_embeddings : Directory path where all the .pkl files are saved
- -path_to_write_dataframe : Path along with name of the file with .pkl extension to write the final dataframe

$ python create_base_dataframe.py [-h] -dataframe_without_feature
				       -path_for_saved_embeddings
				       -path_to_write_dataframe

Output of this script will return :

If -dataframe_without_feature (TYPE 1) dataframe is inputted then ["features] column is added to same dataframe, if not a new dataframe with ["wav_file", "features"] columns is stored

6. Separating Sounds Based on Labels

This script will read the dataframe (TYPE 1) i.e it should have ["labels_name"] column in it , separates out the sounds based on labeling, creates a different dataframe as per labels and writes at target path given.
You can check coarse_labels.csv file to know the mapping of the labels and the separation of each sounds takes place
To separate sounds based on labels navigate to data_preprocessing_cleaning/separating_different_sounds.py
Follow the command below to navigate to the directory and execute the script to separate sounds
$ cd data_preprocessing_cleaning/

7. Training ML/DL Models

Once we have required audioset consisting of different labelled audio clips labels_name each of 10 seconds and their appropriate embeddings features in a dataframe format (preferably) we can use these dataframe files for training ML / DL models
Any Dataframe file with these columns in it can be included into training data i.e ["wav_file", "labels_name", "features]
We have to add the path of that required dataframe (Includes above mentioned columns) in balanced_data.py
Once the path of the required datframe is placed in above mentioned script navigate to models/ to train different types of ML and DL models
To navigate follow the command below

$ cd models/

Note: Navigate to models/

8. Predicting on Audio files using trained ML/DL models

This folder consists of scripts used to predict audio files using different types of ML/DL trained models
After training ML/DL models we will able to save the trained model weights in .h5 file for each model. To predict any audio file we will be using these model weights file to make predictions
Navigate to predictions/ folder to start predictng on single/multiple audio files using different models
To navigate follow the command below

$ cd predictions/

Note: Navigate to predictions/

9. Compressing & Decompressing of Audio Files

Definition of Audio Compression: Wikipedia Source
Two different types of Audio compression
- Lossy Audio Compression
- Lossless Audio Compression

To perform various types of audio compression techniques & decompressing back the compressed audio files navigate to compression/ directory. To navigate follow the command below

$ cd compression/

Note: Navigate to compression/

10. Goertzel Algorithm

Definition of Goertzel Filter: Wikipedia Source
Navigate to goertzel_filter/ directory if you want to :
- Visualize the audio file in spectrogram after applying goertzel filter
- Extract particular frequency components of an audio file
To navigate to goertzel_filter/ directory follow the below command

$ cd goertzel_filter/

Note: Navigate to goertzel_filter/

11. Dash User Interface Applications

About Dash Framework: Dash | Plotly
We have used Dash framework for building local web apps for different purposes stated below
- Audio Annotation : We can annotate audio files (.wav format) in any folder present locally and save all the annotations in .csv file. It also enables to view spectrogram and see the model's prediction for that wavfile
  - To annotate audio files navigate to Dash_integration/annotation/
  - Follow the command to navigate to that folder in terminal
  - $ cd Dash_integraion/annotation/
- Device Report : Enables to see generate a concise report for each device that is uploading files in FTP server. Device parameters such as Battery performance, Location Details etc can be visualized using this app
  - To generate report navigate to Dash_integration/device_report/
  - Follow the command to navigate to this folder in terminal
  - $ cd Dash_integraion/device_report/
- Monitoring and Alert : Enables user to monitor FTP server directories, Device(s), get alert based on detection of any sounds of interest, upload multiple audio wavfiles to see the predictions etc
  - To monitor and get alerts via SMS navigate to Dash_integration/monitoring_alert/
  - Follow the command in terminal to navigate to this
  - $ cd Dash_integration/monitoring_alert/

Name		Name	Last commit message	Last commit date
Latest commit History 431 Commits
Dash_integration		Dash_integration
augmentation		augmentation
blacklisted_audiofiles		blacklisted_audiofiles
compression		compression
continous_predictions		continous_predictions
data_preprocessing_cleaning		data_preprocessing_cleaning
externals		externals
get_data		get_data
goertzel_filter		goertzel_filter
ipython_notebooks		ipython_notebooks
model_configs/binary_relevance_model		model_configs/binary_relevance_model
models		models
predictions		predictions
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
balancing_dataset.py		balancing_dataset.py
create_base_dataframe.py		create_base_dataframe.py
download_data_files.sh		download_data_files.sh
find_sounds.py		find_sounds.py
generating_embeddings.py		generating_embeddings.py
launch_jupyter_notebook.sh		launch_jupyter_notebook.sh
requirements.txt		requirements.txt
ubuntu_packages_install.sh		ubuntu_packages_install.sh
xenocanto_to_dataframe.py		xenocanto_to_dataframe.py
youtube_audioset.py		youtube_audioset.py

License

aryamansriram/modular_acoustic_detection

Folders and files

Latest commit

History

Repository files navigation

Modular Acoustic Detection

1. Environment Setup

1.1 Ubuntu Environment Setup

Create a separate environment with python 2.7 Version

Activate the created environment

1.2 Local Repository setup

Clone the Repository

To load all git sub modules :

To download all the data files :