Few-shot sound recognition using attentional similarity

PyTorch implementation of [Learning to match transient sound events using attentional similarity for few-shot sound recognition] (paper)

Citation

If you use this code in your research, please cite our paper.

@inproceedings{chou2019learning,
    title={Learning to match transient sound events using attentional similarity for few-shot sound recognition},
    author={Szu-Yu Chou and Kai-Hsiang Cheng and Jyh-Shing Roger Jang and Yi-Hsuan Yang},
    booktitle = {Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    year={2019}
}

Requirements

Python 2.7
PyTorch 0.4.0
LibROSA 0.5.0
CUDA 9.0

Getting Started

1. Mel-spec data of (noise) ESC50 (link)

We provide the mel-spectrogram data, which was extracted from the wave files using the default parameters shown in main.py. Once you have downloaded the data, unzip data.zip and place the data folder under the attentional-similarity folder.

The data folder contains the following files:

ESC_sep.npy: mel-spec of ESC50
ESC_noise_sep.npy: mel-spec of noise ESC50
ESC_tag.npy: class index of ESC_sep and ESC_noise_sep entries
ESC_tag2idx.npy: mapping between class names and class indices
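
Before training, it can help to confirm that the unzipped files are where the scripts expect them. The following is a minimal sketch and not part of the repository: the helper name `missing_data_files` is hypothetical, and it assumes the four files listed above sit directly inside a `data` directory.

```python
import os

# File names taken from the list above.
EXPECTED_FILES = (
    "ESC_sep.npy",
    "ESC_noise_sep.npy",
    "ESC_tag.npy",
    "ESC_tag2idx.npy",
)


def missing_data_files(data_dir="data"):
    """Return the expected file names that are absent from data_dir.

    Hypothetical helper: assumes the files sit directly inside data_dir.
    """
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files: " + ", ".join(missing))
    else:
        print("All expected data files found.")
```

Running the script from the attentional-similarity folder lists any file that still needs to be unzipped or moved.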

2. Train

2.1 Specify cuda device

Optionally, modify CUDA_VISIBLE_DEVICES in Trainer.py to choose the GPU that the model runs on. The default setting is 0:

```
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
```
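
CUDA_VISIBLE_DEVICES only takes effect if it is set before the process initializes CUDA, so the simplest safe place for the assignment is before torch is imported. As a sketch of an alternative to editing the file each time, the snippet below keeps a value already exported in the shell and falls back to '0' otherwise. The helper name `select_gpu` is hypothetical and is not part of Trainer.py.

```python
import os


def select_gpu(default="0"):
    """Keep an existing CUDA_VISIBLE_DEVICES value, else use `default`.

    Hypothetical helper, not part of Trainer.py.
    """
    return os.environ.setdefault("CUDA_VISIBLE_DEVICES", default)


# Call this before importing torch so the setting is in place
# before CUDA is initialized.
device_ids = select_gpu()
print("Visible CUDA devices: " + device_ids)
```

If Trainer.py were changed along these lines, a value exported in the shell would take precedence over the hard-coded default.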

2.2 Start training

Run the main.py script and specify the dataset name with --dn:

ESC50
```
  $ python main.py --dn ESC_50
```
noise ESC50
```
  $ python main.py --dn ESC_noise_50
```
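
To train on both datasets in sequence, the two commands can be wrapped in a small driver script. This is a sketch under assumptions: the script name main.py and the --dn values come from the commands above, while `build_command` and `train_all` are hypothetical helpers that do not exist in the repository.

```python
import subprocess
import sys

DATASETS = ("ESC_50", "ESC_noise_50")


def build_command(dataset, script="main.py"):
    """Build the argument list for one run on `dataset`."""
    return [sys.executable, script, "--dn", dataset]


def train_all(datasets=DATASETS, dry_run=False):
    """Run each dataset in turn, stopping at the first failure.

    Returns the list of commands that were (or would be) executed.
    """
    commands = []
    for name in datasets:
        cmd = build_command(name)
        print("Running: " + " ".join(cmd))
        if not dry_run:
            # check_call raises CalledProcessError on a non-zero exit code.
            subprocess.check_call(cmd)
        commands.append(cmd)
    return commands


# Dry run: only prints the commands. Pass dry_run=False to actually train.
train_all(dry_run=True)
```

The driver uses the same interpreter that runs it (sys.executable), so it should be launched from the environment that has the requirements installed.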

3. Test with trained model

3.1 Specify cuda device

Optionally, modify CUDA_VISIBLE_DEVICES in Tester.py to choose the GPU that the model runs on. The default setting is 0:

```
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
```

3.2 Specify trained model path

Set the model path pmp in main_test.py to the path of the trained model that you want to test.

3.3 Start testing

Run the main_test.py script and specify the dataset name with --dn:

ESC50
```
  $ python main_test.py --dn ESC_50
```

noise ESC50
```
  $ python main_test.py --dn ESC_noise_50
```

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
