Spectral Autoencoder

This is a Keras implementation of a variational version of the baseline spectral autoencoder described in the Google DeepMind paper.

You can have a look at our results here:

  • Audio generation with VAE: https://www.youtube.com/watch?v=I7eWJuqg3zU

Dataset

We used a subset of the public NSynth dataset composed of brass and flute sounds. We computed the log-magnitude spectrogram of each audio file and used it as both the input and the target during training. As mentioned in the original article, we used the Griffin-Lim algorithm to reconstruct the phase of each signal.
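A minimal sketch of this preprocessing, assuming standard librosa STFT and Griffin-Lim utilities; the sample rate, FFT size and hop length below are illustrative assumptions, not the repository's actual values:

```python
import numpy as np
import librosa

# Illustrative STFT parameters; the repository's actual values may differ.
SR = 16000
N_FFT = 1024
HOP_LENGTH = 256

def audio_to_log_magnitude(path):
    """Load an audio file and return its log-magnitude spectrogram."""
    y, _ = librosa.load(path, sr=SR)
    spec = librosa.stft(y, n_fft=N_FFT, hop_length=HOP_LENGTH)
    return np.log1p(np.abs(spec))  # log(1 + |STFT|), used as network input/target

def log_magnitude_to_audio(log_mag, n_iter=100):
    """Invert a log-magnitude spectrogram; Griffin-Lim estimates the missing phase."""
    mag = np.expm1(log_mag)
    return librosa.griffinlim(mag, n_iter=n_iter, n_fft=N_FFT, hop_length=HOP_LENGTH)
```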

Implementation

We implemented a variational version of the baseline autoencoder to see whether meaningful audio generation was possible in this setting. To reduce the very large number of parameters of the original model, we decreased the number of filters with respect to it. In this case too, the phase was reconstructed using the Griffin-Lim algorithm.
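To illustrate the variational part, here is a minimal convolutional VAE sketch in Keras. It is not the repository's exact architecture: the latent dimension, filter counts and layer shapes are assumptions. The KL divergence is added inside the sampling layer, so the compiled loss only covers the reconstruction of the log-magnitude spectrogram.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

LATENT_DIM = 64  # assumed latent dimensionality, not taken from the repository

class Sampling(layers.Layer):
    """Reparameterization trick: draw z ~ N(z_mean, exp(z_log_var))."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        # KL divergence between q(z|x) and the unit Gaussian prior,
        # added here so that compile() only needs the reconstruction loss.
        kl = -0.5 * tf.reduce_mean(
            1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
        self.add_loss(kl)
        eps = tf.random.normal(tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps

def build_vae(input_shape):
    """Build a small convolutional VAE over (freq, time, 1) spectrogram patches."""
    enc_in = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 4, strides=2, padding="same", activation="relu")(enc_in)
    x = layers.Conv2D(64, 4, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    z_mean = layers.Dense(LATENT_DIM)(x)
    z_log_var = layers.Dense(LATENT_DIM)(x)
    z = Sampling()([z_mean, z_log_var])

    # Decoder mirrors the encoder; input dimensions are assumed divisible by 4.
    h, w = input_shape[0] // 4, input_shape[1] // 4
    d = layers.Dense(h * w * 64, activation="relu")(z)
    d = layers.Reshape((h, w, 64))(d)
    d = layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu")(d)
    dec_out = layers.Conv2DTranspose(1, 4, strides=2, padding="same")(d)

    vae = Model(enc_in, dec_out)
    vae.compile(optimizer="adam", loss="mse")  # MSE on the log-magnitude spectrogram
    return vae

# Usage sketch: train on log-magnitude spectrograms as both input and target.
# vae = build_vae((512, 256, 1))
# vae.fit(spectrograms, spectrograms, epochs=50, batch_size=16)
```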

License

The code is released under the terms of the MIT license.
