Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook
This repo contains the code for the paper "Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook" by Kyungyun Lee, Keunwoo Choi and Juhan Nam, presented at the 19th International Society for Music Information Retrieval Conference (ISMIR) 2018. [pdf, blog post]
The MedleyDB songs that contain vocals are listed in medleydb_vocal_songs.txt. Run
python medley_voice_label.py
to generate labels for the 61 songs.
To generate the test datasets, run
python vibrato_data_gen.py
for the vibrato test in Section 5.1, and
python snr_data_gen.py
for the SNR test in Section 5.2. (Requires modifying the path to the MedleyDB songs that contain vocals.)
There are 3 reproduced models in the following folders:
lehner_randomforest [1]
schluter_cnn [2]
leglaive_lstm [3]
Command-line arguments are:
--model_name: the name you gave the model during training; the model is saved under this name in the ./weights/ folder.
--dataset: one of {"jamendo", "vibrato", "snr"}. New datasets can be added by modifying load_data.py (RWC pop might be added).
In each model folder, the audio processor must be run to preprocess the data before experimenting with the model. Run
python audio_processor.py --dataset "jamendo"
in the CNN and RNN models, with --dataset one of {"jamendo", "vibrato", "snr"}, or
python vocal_var.py --dataset "jamendo"
in the random forest model, with --dataset one of {"jamendo", "vibrato", "snr"}.
To train a model, run
python main.py --model_name "mynewmodel"
To test a trained model, run
python test.py --model_name "mynewmodel" --dataset "jamendo"
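The preprocess, train, and test steps above can be chained per model folder. The following is a minimal sketch, not part of the repo: build_commands is a hypothetical helper, and it assumes that schluter_cnn and leglaive_lstm are the CNN and RNN models that use audio_processor.py, that main.py and test.py exist in every model folder, and that the commands are run from inside that folder. It only builds and prints the commands. It also preprocesses only the dataset being tested; whether training needs a different dataset to be preprocessed first is not stated here.

```python
import shlex

# Preprocessing script for each model folder, taken from the commands above.
# The folder-to-script mapping for the CNN and RNN models is an assumption.
PREPROCESS_SCRIPT = {
    "lehner_randomforest": "vocal_var.py",
    "schluter_cnn": "audio_processor.py",
    "leglaive_lstm": "audio_processor.py",
}
DATASETS = ("jamendo", "vibrato", "snr")


def build_commands(model_folder, model_name, dataset="jamendo"):
    """Return the preprocess, train, and test commands as argument lists."""
    if model_folder not in PREPROCESS_SCRIPT:
        raise ValueError(f"unknown model folder: {model_folder!r}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        ["python", PREPROCESS_SCRIPT[model_folder], "--dataset", dataset],
        ["python", "main.py", "--model_name", model_name],
        ["python", "test.py", "--model_name", model_name, "--dataset", dataset],
    ]


if __name__ == "__main__":
    for cmd in build_commands("schluter_cnn", "mynewmodel", "jamendo"):
        # Printed for inspection only; nothing is executed here.
        print(shlex.join(cmd))
```

To actually execute the steps, each argument list could be passed to subprocess.run(cmd, cwd=model_folder, check=True), again assuming the scripts expect to be started from their own folder.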