VGG16-SNU-B36-50

This project classifies inter-floor noise in a building (the SNU-B36-50 dataset) using VGG16.
VGG16 is fine-tuned on SNU-B36-50 without freezing any weights.
The model is evaluated using 5-fold cross-validation.
The following confusion matrix shows the evaluation results.
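
For readers who want to see how a confusion matrix is tallied from validation predictions, here is a minimal sketch in plain Python. It is not the code in cfmtx.py (which draws the matrix). The function names, the convention that rows are true classes and columns are predicted classes, and the toy labels are all assumptions made for illustration; they are not results from this project.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count (true, predicted) pairs. Rows: true class, columns: predicted class (assumed convention)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_class, predicted_class in zip(true_labels, predicted_labels):
        matrix[true_class][predicted_class] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy labels for three hypothetical classes (not data from SNU-B36-50).
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(true_labels, predicted_labels, num_classes=3)
for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
```

In a 5-fold setting, one such matrix could be computed per validation fold and the five matrices summed element-wise to obtain an overall matrix.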

Notice

VGG16-SNU-B36-50 will be merged into the indoor-noise repository in the near future.

Requirements

Python (version 3.5.2)
Python modules: TensorFlow (version 1.2), NumPy, SciPy, pandas, matplotlib, librosa, and pickle
Pretrained weights of VGG16
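
Before running anything, it can help to check that the modules listed above are importable. The following is a small sketch using only the standard library. The helper name find_missing and the REQUIRED_MODULES list are illustrative and not part of this project, and the check does not verify versions (for example TensorFlow 1.2).

```python
import importlib.util

# Import names assumed for the modules listed in the requirements.
REQUIRED_MODULES = [
    "tensorflow", "numpy", "scipy", "pandas", "matplotlib", "librosa", "pickle",
]


def find_missing(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```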

Contents

audio: This folder contains the inter-floor noise clips (SNU-B36-50) used for training and validation
dataset: This folder contains the metadata. When the audio clips are converted to log-scaled Mel-spectrograms, the spectrograms are also saved into this folder
result: Cross-validation accuracy and the confusion matrix are saved into this folder
cfmtx.py: This contains a function for drawing a confusion matrix
feature.py: This contains melspec2, a function that converts the audio clips to log-scaled Mel-spectrograms using LibROSA
gen_data.py: This reads the metadata and converts the audio clips to log-scaled Mel-spectrograms using feature.py. The Mel-spectrograms are saved as .p files in "dataset"
load_data.py: This loads the training data and validation data (currently, batch and mini-batch loading are supported)
vgg16_adap.py: This builds the VGG16 network architecture with an adaptation layer. It also provides save_weights(), which saves the weights as .npz after training
main.py: You need to set several parameters. The current settings in the code are those used for the IWAENC 2018 submission
- gpu_device: Select a GPU device
- is_transfer_learn: Whether to transfer the pretrained weights
- gen_data: If this is TRUE, the audio clips are converted to log-scaled Mel-spectrograms and saved as .p
- freeze_layer
  - True: Freeze all weights except fc3w, fc3b, fc4w, and fc4b
  - False: Do not freeze any weights
- bn: If this is TRUE, batch normalization is turned on
- saver: If this is TRUE, the weights at the last epoch are saved as .npz
- fold: Use the k-th subsample as the validation set
results: Training loss, validation loss, training accuracy, and validation accuracy are saved here
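
The fold parameter above selects which subsample is held out for validation. The sketch below illustrates that idea in plain Python. It assumes a simple round-robin assignment of clips to folds and a 0-based fold index; both are assumptions, since the project's metadata may assign folds differently, and split_fold is a hypothetical helper that is not taken from main.py or load_data.py.

```python
def split_fold(items, num_folds, fold):
    """Split items into (train, validation), holding out the fold-th subsample.

    Assumption: item i belongs to fold i % num_folds, and fold is 0-based.
    """
    if not 0 <= fold < num_folds:
        raise ValueError("fold must be in the range [0, num_folds)")
    train, validation = [], []
    for index, item in enumerate(items):
        if index % num_folds == fold:
            validation.append(item)
        else:
            train.append(item)
    return train, validation


# Ten placeholder clip names (not actual file names from the audio folder).
clips = ["clip_%02d" % i for i in range(10)]

train, validation = split_fold(clips, num_folds=5, fold=2)
print("validation:", validation)
print("training clips:", len(train))
```

Running the split once for each fold index and averaging the validation accuracies gives the usual k-fold cross-validation estimate.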

Quick start

Clone this project: git clone https://github.com/yodacatmeow/VGG16-SNU-B36-50
Download the pretrained VGG16 weights into the project directory
Start a process: CUDA_VISIBLE_DEVICES=0 python3 main.py

Citing

@inproceedings{choi2018floornoise,
  title={Classification of noise between floors in a building using pre-trained deep convolutional neural networks},
  author={Choi, Hwiyong and Lee, Seungjun and Yang, Haesang and Seong, Woojae},
  booktitle={2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC)},
  pages={535--539},
  year={2018},
  organization={IEEE}
}
