sharifahmad2061/looking-to-listen

Looking to Listen

This is an implementation of "Looking to Listen at the Cocktail Party" in Python 3 and Chainer. This deep learning technique can be applied to noise reduction, background music removal, and speech separation.

The original paper is available at arxiv.org/abs/1804.03619. Note that this implementation is inspired by crystal-method (MIT).

Quick Start Demonstration (Audio-only Noise Reduction)

This section demonstrates noise reduction using a pretrained model.

  1. First, build the Docker container.
$ docker-compose build
  2. Put the noisy audio file(s) in ./data/noise.

  3. Run the command that matches your hardware.

  • GPU
$ docker-compose run network python3 quick_start_audio_only.py /data/model/0f_1sclean_noise.npz /data/noise
  • CPU (comment out _set_gpu() in network/src/env.py)
Intel CPU (fast)
$ docker-compose run network python3 quick_start_audio_only.py /data/model/0f_1sclean_noise.npz /data/noise -ideep
Other CPU (slow)
$ docker-compose run network python3 quick_start_audio_only.py /data/model/0f_1sclean_noise.npz /data/noise
  4. The clean audio is written to ./data/results.
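
The steps above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `stage_noisy_files` and `build_quick_start_command` are hypothetical, and the only things taken from this README are the command line, the model path, and the ./data/noise folder. It assumes the script is run from the repository root after `docker-compose build`.

```python
from pathlib import Path
import shutil

# Paths as they appear in the quick start commands (inside the container).
MODEL_PATH = "/data/model/0f_1sclean_noise.npz"
NOISE_DIR = "/data/noise"


def stage_noisy_files(sources, noise_dir="./data/noise"):
    """Copy noisy audio files into the host folder the demo reads from.

    Hypothetical helper: it only copies files and does not check the format.
    """
    target = Path(noise_dir)
    target.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy(src, target)) for src in sources]


def build_quick_start_command(use_ideep=False):
    """Return the quick start command as an argument list.

    use_ideep=True appends the -ideep flag shown for Intel CPUs.
    The GPU and "other CPU" variants use the same command line.
    """
    cmd = [
        "docker-compose", "run", "network",
        "python3", "quick_start_audio_only.py",
        MODEL_PATH, NOISE_DIR,
    ]
    if use_ideep:
        cmd.append("-ideep")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; to execute it, pass the
    # list to subprocess.run(..., check=True).
    print(" ".join(build_quick_start_command(use_ideep=False)))
```

Building the command as a list rather than a single string avoids shell quoting issues if the paths are later changed to ones that contain spaces.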

Usage

Please refer to the following section for additional information such as speech separation and audio-visual processing.

Open in bash

$ docker-compose run preprocess bash
$ docker-compose run dataset bash
$ docker-compose run network bash

Differences from original paper

The network in the original paper has a large fully connected (FC) layer, which does not fit in GPU memory. This implementation reduces the size of the FC layer so that the network fits on a single GPU.
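
To see why shrinking an FC layer helps, the sketch below estimates the memory taken by the weights and biases of a single FC layer. The layer sizes are hypothetical and chosen only for illustration; they are not the sizes used in the paper or in this repository. The function name `fc_layer_bytes` is also made up for this example, and 4 bytes per parameter assumes float32.

```python
def fc_layer_bytes(in_features, out_features, bytes_per_param=4):
    """Bytes needed to store one FC layer's weight matrix and bias vector."""
    n_params = in_features * out_features + out_features
    return n_params * bytes_per_param


# Hypothetical sizes, for illustration only.
large = fc_layer_bytes(8192, 8192)   # wide layer
small = fc_layer_bytes(8192, 1024)   # same input, 8x fewer output units

print(f"large: {large / 2**20:.1f} MiB")
print(f"small: {small / 2**20:.1f} MiB")
```

The parameter count grows with the product of the input and output widths, so reducing either one reduces memory roughly in proportion. During training the footprint is larger than this estimate, because gradients and optimizer state are stored alongside the parameters.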

External Libraries

The external libraries used by this project are located in preprocess/src/libs.
