Spectral Autoencoder

This is a Keras implementation of a variational version of the baseline spectral autoencoder described in the Google DeepMind paper.

You can have a look at our results here:

  • Audio generation with VAE: https://www.youtube.com/watch?v=I7eWJuqg3zU

Dataset

We used a subset of the public NSynth dataset composed of brass and flute sounds. We computed the log-magnitude spectrogram of each audio file and used it as both the input and the target during training. As mentioned in the original article, we used the Griffin-Lim algorithm to reconstruct the phase of each signal.
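A minimal sketch of this preprocessing, assuming standard librosa STFT and Griffin-Lim utilities; the sample rate, FFT size and hop length below are illustrative assumptions, not the repository's actual values:

```python
import numpy as np
import librosa

# Illustrative STFT parameters; the repository's actual values may differ.
SR = 16000
N_FFT = 1024
HOP_LENGTH = 256

def audio_to_log_magnitude(path):
    """Load an audio file and return its log-magnitude spectrogram."""
    y, _ = librosa.load(path, sr=SR)
    spec = librosa.stft(y, n_fft=N_FFT, hop_length=HOP_LENGTH)
    return np.log1p(np.abs(spec))  # log(1 + |STFT|), used as network input/target

def log_magnitude_to_audio(log_mag, n_iter=100):
    """Invert a log-magnitude spectrogram; Griffin-Lim estimates the missing phase."""
    mag = np.expm1(log_mag)
    return librosa.griffinlim(mag, n_iter=n_iter, n_fft=N_FFT, hop_length=HOP_LENGTH)
```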

Implementation

We implemented a variational version of the baseline autoencoder to see whether meaningful audio generation was possible in this setting. To reduce the very large number of parameters of the original model, we decreased the number of filters with respect to it. In this case too, the phase was reconstructed using the Griffin-Lim algorithm.
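To illustrate the variational part, here is a minimal convolutional VAE sketch in Keras. It is not the repository's exact architecture: the latent dimension, filter counts and layer shapes are assumptions. The KL divergence is added inside the sampling layer, so the compiled loss only covers the reconstruction of the log-magnitude spectrogram.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

LATENT_DIM = 64  # assumed latent dimensionality, not taken from the repository

class Sampling(layers.Layer):
    """Reparameterization trick: draw z ~ N(z_mean, exp(z_log_var))."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        # KL divergence between q(z|x) and the unit Gaussian prior,
        # added here so that compile() only needs the reconstruction loss.
        kl = -0.5 * tf.reduce_mean(
            1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
        self.add_loss(kl)
        eps = tf.random.normal(tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps

def build_vae(input_shape):
    """Build a small convolutional VAE over (freq, time, 1) spectrogram patches."""
    enc_in = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 4, strides=2, padding="same", activation="relu")(enc_in)
    x = layers.Conv2D(64, 4, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    z_mean = layers.Dense(LATENT_DIM)(x)
    z_log_var = layers.Dense(LATENT_DIM)(x)
    z = Sampling()([z_mean, z_log_var])

    # Decoder mirrors the encoder; input dimensions are assumed divisible by 4.
    h, w = input_shape[0] // 4, input_shape[1] // 4
    d = layers.Dense(h * w * 64, activation="relu")(z)
    d = layers.Reshape((h, w, 64))(d)
    d = layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu")(d)
    dec_out = layers.Conv2DTranspose(1, 4, strides=2, padding="same")(d)

    vae = Model(enc_in, dec_out)
    vae.compile(optimizer="adam", loss="mse")  # MSE on the log-magnitude spectrogram
    return vae

# Usage sketch: train on log-magnitude spectrograms as both input and target.
# vae = build_vae((512, 256, 1))
# vae.fit(spectrograms, spectrograms, epochs=50, batch_size=16)
```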

License

The code is released under the terms of the MIT license.
