Steps to train a model:
- Download the UrbanSound datasets and unzip them into the raw_data folder
There should be 10 sub-folders in the raw_data folder
UrbanSound8K
UrbanSound
- Change the sample rate of the audio
python src/wav16000.py --raw_data_dir raw_data --data_16000_dir wav_16000
- Perform an STFT on the audio files and save each one as a tensor
python src/audio_pth.py --data_16000_dir wav_16000 --data_dir data
- Create train/eval/test manifest files
python src/manifest.py
- Train
python src/train.py
The best model will be stored in ./log/<date_time>/best.model.pth
- TODO
Correct the train/eval/test splits: AVOID COMMON PITFALLS
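
A possible starting point for the split TODO, as a sketch only. It assumes the manifests are built from the UrbanSound8K metadata CSV (columns slice_file_name and fold) and that whole folds are assigned to each split instead of shuffling clips, since clips cut from the same recording share a fold. The function names, the choice of folds 9 and 10, and the CSV path in the comment are illustrative and are not taken from src/manifest.py.

```python
import csv


def load_metadata(csv_path):
    """Read the metadata CSV into a list of dicts, one per clip.

    Assumed location: raw_data/UrbanSound8K/metadata/UrbanSound8K.csv
    """
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def split_by_fold(rows, eval_fold, test_fold):
    """Assign each clip to train/eval/test using its predefined fold.

    rows: iterable of dicts with "slice_file_name" and "fold" keys.
    Clips are never shuffled across folds, so clips from the same
    source recording cannot end up in both train and test.
    """
    if eval_fold == test_fold:
        raise ValueError("eval_fold and test_fold must differ")
    splits = {"train": [], "eval": [], "test": []}
    for row in rows:
        fold = int(row["fold"])
        if fold == test_fold:
            splits["test"].append(row["slice_file_name"])
        elif fold == eval_fold:
            splits["eval"].append(row["slice_file_name"])
        else:
            splits["train"].append(row["slice_file_name"])
    return splits
```

Example use: splits = split_by_fold(load_metadata(path), eval_fold=9, test_fold=10). For a full evaluation, the same call can be repeated with each fold taking a turn as the test fold.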