Code implementation for the paper 'Symbolic Music Genre Transfer with CycleGAN', adapted from @sumuzhao

callistachang/CZ4042-CycleGAN-Music-Transfer

Music Style Transfer With CycleGANs

Built with TensorFlow 2.3.1.

Special thanks to sumuzhao and the paper Symbolic Music Genre Transfer with CycleGAN for providing the resources and inspiration required to complete this project for NTU's CZ4042 Neural Networks and Deep Learning module.

Major Changes

  • sumuzhao's implementation did not run, for two reasons:

    1. The instance normalization layers and residual blocks were implemented with Lambda layers, which was unsupported.
      • New classes InstanceNormalization and ResNetBlock, both extending keras.layers.Layer, were created to replace them.
      • I raised a GitHub issue about this error on the original code repository and offered to contribute a pull request with the above changes. (Update: The author approved my pull request here! 😄)
    2. Minor bugs in parsing command line arguments.
      • Made a few alterations to the command line parsing logic.
  • SGD and RMSprop were added as optimizer choices.

  • During CycleGAN training, discriminator, generator and cycle losses and accuracies over epochs are pickled for later examination.

  • During classifier training, test losses and accuracies over epochs are pickled for later examination.

  • During classifier testing, the test accuracies on the origin, cycle and transfer datasets are sorted and written to a CSV file for further examination.
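
The logging described in the last three items could look roughly like the sketch below. It is not the repository's code: the function names, the dictionary layout and the descending sort order are all assumptions made for illustration, and it uses only the Python standard library.

```python
import csv
import pickle


def save_history(history, path):
    """Pickle a dict that maps metric names to lists of per-epoch values."""
    with open(path, "wb") as f:
        pickle.dump(history, f)


def load_history(path):
    """Load a history dict previously written by save_history."""
    with open(path, "rb") as f:
        return pickle.load(f)


def write_sorted_accuracies(accuracies, path):
    """Write (dataset, accuracy) rows to a CSV file, highest accuracy first.

    The sort direction is an assumption; the README only says "sorted".
    """
    rows = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["dataset", "accuracy"])
        writer.writerows(rows)
    return rows
```

A training loop would append to lists such as `history["d_loss"]` once per epoch and call `save_history` at the end. A notebook could then call `load_history` and plot each list against the epoch index.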

Additional Scripts

  • /notebooks/visualization.ipynb was created to visualize pickled files containing the losses and accuracies over epochs during CycleGAN and classifier training.

  • /notebooks/tuning.ipynb was created to tune the hyperparameters for the CycleGAN and classifier training. The tuned hyperparameters are as follows:

    1. Standard deviation of Gaussian noise (sigma_d)
    2. Number of filters in convolutional layers (ndf and ngf)
    3. Optimizer choice (optimizer)
    4. Optimizer momentum term (beta1)
    5. Optimizer learning rate (lr)
  • /scripts/classify.py was created to test the classifier on a specified directory containing .npy music arrays.

  • /scripts/tomidi.py was created to convert a .npy music array to a .mid file.
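
As a rough illustration of what tuning over the five hyperparameters listed above involves, the sketch below enumerates a small grid and formats each combination as a training command. It is not taken from tuning.ipynb. The parameter names come from the list, but every value is a placeholder, and only `--sigma_d` is confirmed as a command line flag by the Usage section; the other flag names are assumed to match the parameter names.

```python
import itertools

# Hypothetical search space: names follow the list above, values are
# placeholders and not settings reported by the author.
SEARCH_SPACE = {
    "sigma_d": [0, 1],
    "ngf": [32, 64],
    "optimizer": ["Adam", "SGD", "RMSprop"],
    "beta1": [0.5, 0.9],
    "lr": [0.0002, 0.002],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def to_command(config, dataset_a="JC_J", dataset_b="JC_C"):
    """Format a config as a main.py training command (flag names assumed)."""
    flags = " ".join(f"--{name}={value}" for name, value in config.items())
    return (
        f"python main.py --dataset_A_dir={dataset_a} --dataset_B_dir={dataset_b} "
        f"--phase=train --type=cyclegan {flags}"
    )


if __name__ == "__main__":
    for config in iter_configs(SEARCH_SPACE):
        print(to_command(config))
```

A full grid grows multiplicatively with each added value, so tuning one or two parameters at a time while holding the rest fixed keeps the number of training runs manageable.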

Usage

# Train CycleGAN model
python main.py --dataset_A_dir=JC_J --dataset_B_dir=JC_C --phase=train --type=cyclegan --sigma_d=0

# Generate origin, cycle and transfer outputs with the trained CycleGAN model
python main.py --dataset_A_dir=JC_J --dataset_B_dir=JC_C --phase=test --type=cyclegan --sigma_d=0

# Train classifier model
python main.py --dataset_A_dir=JC_J --dataset_B_dir=JC_C --phase=train --type=classifier --sigma_c=0

# Test classifier model on origin, cycle and transfer outputs
python main.py --dataset_A_dir=JC_J --dataset_B_dir=JC_C --phase=test --type=classifier --sigma_c=0

# Test classifier model on a specified directory containing .npy arrays
python scripts/classify.py --classify_dir=JC_J/test

# Convert a .npy array to a MIDI file
python scripts/tomidi.py --npy_filepath=JC_J/test/jazz_piano_test_1.npy
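
For readers unfamiliar with the flag style used in the main.py commands above, here is a minimal argparse sketch that accepts the same flags. It is not the parser from main.py. The flag names and the values train/test and cyclegan/classifier are taken from the commands, while the defaults, types and help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the main.py flags shown in the Usage commands."""
    parser = argparse.ArgumentParser(description="CycleGAN music style transfer")
    parser.add_argument("--dataset_A_dir", default="JC_J",
                        help="directory name of the first dataset")
    parser.add_argument("--dataset_B_dir", default="JC_C",
                        help="directory name of the second dataset")
    parser.add_argument("--phase", choices=["train", "test"], default="train",
                        help="whether to train or test the selected model")
    parser.add_argument("--type", choices=["cyclegan", "classifier"],
                        default="cyclegan", help="which model to run")
    # sigma_d is the standard deviation of the Gaussian noise (per the README).
    parser.add_argument("--sigma_d", type=float, default=0.0)
    # sigma_c appears in the classifier commands; its meaning is assumed to be
    # the classifier counterpart of sigma_d.
    parser.add_argument("--sigma_c", type=float, default=0.0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Declaring `choices` for `--phase` and `--type` makes argparse reject a mistyped value with a usage message before any model code runs.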

Datasets

The jazz, classical and pop datasets can be downloaded from the zip file here.
