
mengyuan2023/melody-lyrics

 
 


Melody-lyric alignment data

All source URLs of the 1,000 songs for creating melody-lyric alignment data [1]

In progress (512 songs / 1,000 songs)

Description

We provide scripts for melody-lyric alignment.

Requirements

Python 2
pip install romkan
pip install jaconv
pip install jcconv

Install the Stanford CoreNLP pywrapper.

MeCab (Japanese morphological analyzer)
CaboCha (Japanese dependency parser)
Python modules for MeCab and CaboCha
MeCab dictionaries: ipadic and UniDic

nkf (character encoding converter, used for Shift-JIS -> UTF-8)

Usage

0. Prepare dictionary files

wget http://nlp.stanford.edu/software/stanford-corenlp-full-2013-06-20.zip

Download ipadic from the MeCab site (MeCab: Yet Another Part-of-Speech and Morphological Analyzer) and unidic from the UniDic site, then move them under dic/:

mv unidic dic/
mv dic/dicrc dic/unidic/
mv ipadic dic/

1. Collect text and melody files

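  1. Prepare lyrics.txt in the following format: @title and @artist header lines, then the lyrics as lines grouped into paragraphs, with one blank line between paragraphs. Japanese songs with English mixed in are supported. (The Japanese sample lines below state these same rules.)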
@title sample
@artist anonymous
これはサンプルです
歌詞は行と段落で構成されます

段落の間には1行の空行があります

英語が混ざっている日本語の曲も対応しています
  2. Prepare melody.ust in UTAU format. (See Utau - Wikipedia)

  3. Convert the character encoding of the UTAU file (Shift-JIS -> UTF-8).

nkf -w8 --overwrite melody.ust
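
The preparation steps above can also be sketched in Python, which may help when checking many songs at once. This is a minimal sketch, not part of the repository: the function names parse_lyrics and ust_to_utf8 are hypothetical, the header handling is inferred only from the sample above, and the source encoding is assumed to be cp932 (the Windows variant of Shift-JIS). The utf-8-sig codec writes a byte order mark, which is what nkf -w8 is expected to produce.

```python
import io


def parse_lyrics(path):
    """Read lyrics.txt: '@key value' header lines, then paragraphs of
    lyric lines separated by blank lines (layout inferred from the sample)."""
    meta, paragraphs, current = {}, [], []
    # utf-8-sig accepts files with or without a byte order mark
    with io.open(path, encoding="utf-8-sig") as f:
        for raw in f:
            line = raw.strip()
            if line.startswith("@"):
                # Header line such as '@title sample'
                key, _, value = line[1:].partition(" ")
                meta[key] = value.strip()
            elif line:
                current.append(line)
            elif current:
                # A blank line closes the current paragraph
                paragraphs.append(current)
                current = []
    if current:
        paragraphs.append(current)
    return meta, paragraphs


def ust_to_utf8(path, source_encoding="cp932"):
    """Re-encode a UTAU file in place as UTF-8 with a byte order mark.

    source_encoding is an assumption; nkf detects it automatically.
    newline="" keeps the original line endings untouched.
    """
    with io.open(path, encoding=source_encoding, newline="") as f:
        text = f.read()
    with io.open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)
```

For example, parse_lyrics("lyrics.txt") returns the header fields as a dict and the lyrics as a list of paragraphs, each a list of lines. Running ust_to_utf8 twice on the same file would fail or garble it, since the file is no longer Shift-JIS after the first pass.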

2. Move text and melody files

mkdir pair_data   
mkdir pair_data/sample  
cp lyrics.txt pair_data/sample/sample.txt
cp melody.ust pair_data/sample/sample.ust
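
When staging more than one song, the same directory layout can be produced from Python. The sketch below is an illustration under assumptions: stage_pair is a hypothetical helper, and the rule that both files take the directory's name (sample/sample.txt, sample/sample.ust) is inferred from the commands above rather than confirmed from align_data.py.

```python
import os
import shutil


def stage_pair(name, lyrics_path, melody_path, root="pair_data"):
    """Copy one lyrics/melody pair to <root>/<name>/<name>.txt and .ust."""
    song_dir = os.path.join(root, name)
    # Create pair_data/<name>/ if it does not exist yet
    if not os.path.isdir(song_dir):
        os.makedirs(song_dir)
    txt = os.path.join(song_dir, name + ".txt")
    ust = os.path.join(song_dir, name + ".ust")
    shutil.copy(lyrics_path, txt)
    shutil.copy(melody_path, ust)
    return txt, ust
```

Calling stage_pair("sample", "lyrics.txt", "melody.ust") mirrors the four shell commands above.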

3. Run!

python align_data.py > data.txt

Data format

See sample data.txt


  • [1] Kento Watanabe, Yuichiroh Matsubayashi, Satoru Fukayama, Masataka Goto, Kentaro Inui and Tomoyasu Nakano. A Melody-conditioned Lyrics Language Model. In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018).
