DSR-Bird-Song

Live demo

What is it?

This is the repo for our portfolio project for DSR. The goal is to train a model that can reliably classify bird species from their songs and to make it available as a web service/app. Our motivation is twofold: we want to contribute to the development of tools for automated biodiversity monitoring, and we want to provide bird enthusiasts with a handy tool.

Contributors:

Data

The bird recordings are downloaded from the Xeno-Canto database with

python data_preparation/audio_acquisition/download_files_threaded.py

Pre-processing

The audio recordings vary greatly in quality and in the number of species present. Assuming that the foreground species is usually the loudest in a recording, we follow the methodology described in Sprengel et al., 2016 to extract signal sections from a noisy background. Our script localizes spectrogram sections with amplitudes above 3 times the frequency-axis and time-axis medians, allowing us to extract the audio sections most likely to contain foreground bird vocalizations. We run the script over all recordings in our storage and store the resulting timestamps of the signal sections in our database.
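The median-threshold step described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repo's script: the function names `signal_mask` and `mask_to_sections` are hypothetical, the input is assumed to be a magnitude spectrogram of shape (frequency bins, time frames), and the morphological clean-up steps from Sprengel et al. are left out.

```python
import numpy as np

def signal_mask(spec, factor=3.0):
    """Return a boolean mask over time frames that likely contain signal.

    A spectrogram cell counts as signal when it exceeds `factor` times both
    the median of its frequency band (row) and the median of its time frame
    (column). A frame is kept if it contains at least one such cell.
    """
    row_median = np.median(spec, axis=1, keepdims=True)  # one value per frequency band
    col_median = np.median(spec, axis=0, keepdims=True)  # one value per time frame
    cell_mask = (spec > factor * row_median) & (spec > factor * col_median)
    return cell_mask.any(axis=0)

def mask_to_sections(frame_mask, hop_seconds):
    """Turn a per-frame boolean mask into a list of (start, end) times in seconds."""
    sections = []
    start = None
    for i, active in enumerate(frame_mask):
        if active and start is None:
            start = i  # a signal section begins here
        elif not active and start is not None:
            sections.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:  # the section runs to the end of the recording
        sections.append((start * hop_seconds, len(frame_mask) * hop_seconds))
    return sections
```

The (start, end) pairs returned by `mask_to_sections` correspond to the kind of timestamps that would be stored per recording; `hop_seconds` is the assumed time step between adjacent spectrogram frames.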

To train the model, the recordings first need to be converted into spectrograms. There are different ways of doing this:

Fourier transformation: stft_s
Mel spectrogram: mel_s
Chirp spectrogram: chirp_s
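For orientation, the simplest of these options, a magnitude spectrogram from a short-time Fourier transform, can be sketched in plain NumPy as below. This is not the implementation behind `stft_s`, whose signature is not shown here; the function name `stft_magnitude` and the parameters `n_fft` and `hop` are assumptions chosen for the example.

```python
import numpy as np

def stft_magnitude(signal, n_fft=512, hop=128):
    """Magnitude spectrogram of a 1-D signal, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([
        signal[i * hop : i * hop + n_fft] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of the real-valued input
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Example: one second of a 2 kHz tone at a 22050 Hz sample rate
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 2000 * t)
spec = stft_magnitude(tone)
```

Each row of `spec` corresponds to a frequency band of width `sample_rate / n_fft` Hz, and each column to one time frame. A mel spectrogram would additionally map these linear frequency bands onto a mel-scaled filter bank.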

A key challenge in pre-processing is maintaining flexibility in the choice of spectrogram functions and parameters. Storage space is limited and the number of available recordings is vast, so we developed a custom implementation of the PyTorch Dataset class that uses a background process to load and convert audio dynamically. Ideally, this process should be fast enough to preload an entire batch during training. However, a major bottleneck in audio loading is resampling. We therefore resample all files in our database to 22050 Hz as a first step, so that they can later be loaded at their native sample rate.
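The preloading idea can be illustrated with a small standard-library sketch. It is a simplification under stated assumptions, not the repo's Dataset class: it uses a background thread instead of a separate process, does not subclass the PyTorch Dataset, and the names `BackgroundLoader` and `load_fn` are hypothetical.

```python
import queue
import threading

class BackgroundLoader:
    """Prepare batches in a background thread while the consumer is busy."""

    def __init__(self, items, load_fn, batch_size, max_prefetch=1):
        self.items = items
        self.load_fn = load_fn          # e.g. load audio and convert it to a spectrogram
        self.batch_size = batch_size
        # A bounded queue stops the worker from running far ahead of training
        self._queue = queue.Queue(maxsize=max_prefetch)
        self._worker = threading.Thread(target=self._fill, daemon=True)
        self._worker.start()

    def _fill(self):
        for start in range(0, len(self.items), self.batch_size):
            chunk = self.items[start : start + self.batch_size]
            self._queue.put([self.load_fn(item) for item in chunk])
        self._queue.put(None)           # sentinel: no more batches

    def __iter__(self):
        while True:
            batch = self._queue.get()   # blocks until the next batch is ready
            if batch is None:
                return
            yield batch
```

Iterating over `BackgroundLoader(paths, load_fn, batch_size)` yields one list of loaded items per batch. With `max_prefetch=1`, the worker prepares the next batch while the current one is being consumed, which is the behavior the paragraph above aims for. The sketch does not handle exceptions raised inside `load_fn`.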

For rapid model development we are currently using a small subsample of the data for which we have precomputed mel-spectrograms.

Model

We build the following models:

Bulbul: (Grill & Schlüter, 2017)
Sparrow: (Grill & Schlüter, 2017)
SparrowExp: (Schlüter, 2018)
Zipzalp: own creation by Tim
lstm: created by Satyan
Eagle: created by Satyan
Goose: created by Satyan
Robin: created by Satyan
Owl: created by Satyan
Pigeon: created by Satyan
Hawk: created by Satyan - PyTorch implementation of (Pons et al., 2018)

Model training

Configure your run in scripts/config.py

Running a job locally:

sh run.sh

Running a job on Paperspace:

paperspace jobs create --command "sh run.sh" --container "multavici/bird-song:latest" --apiKey <api-Key> --workspace "https://github.com/multavici/DSR-Bird-Song" --machineType "G1"

Open a bash shell in the Docker container with the current working directory mounted:

docker run -it --mount src="$(pwd)",target=/test,type=bind multavici/bird-song /bin/bash
