Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
If you have any questions, please open an issue.
If you find this project helpful, please consider giving it a star.
Demo Pages: Results of pure speech separation model
2020-02-01: Read the paper "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation". Zhihu article: "Reading Notes: Dual-path RNN for Speech Separation". Blog article: "Reading Notes: Dual-path RNN for Speech Separation". Both articles are interpretations of the paper. Questions and discussion are welcome.
2020-02-02: Completed the data preprocessing and dataset code. Dataset code: /data_loader/Dataset.py
2020-02-03: Completed the Conv-TasNet framework (updated /model/model.py, Trainer_Tasnet.py, Train_Tasnet.py).
2020-02-07: Completed the training code (updated /model/model_rnn.py). Test parameters and some details are still being adjusted.
2020-02-08: Fixed bugs in the code.
2020-02-11: Completed the testing code.
We use the WSJ0 dataset for training, validation, and testing. Below are the data download link and the audio mixing code for WSJ0.
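To illustrate what mixing two utterances involves, here is a minimal, dependency-free sketch. It is not the WSJ0 mixing script referred to above: the function names `rms` and `mix_at_snr` are hypothetical, signals are plain lists of float samples of equal length, and the target SNR is a free parameter rather than a value taken from the dataset recipe.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(s1, s2, snr_db):
    """Mix two equal-length signals so that s1 is snr_db dB louder than s2.

    Hypothetical helper for illustration only.
    Returns (mixture, scaled_s2).
    """
    if len(s1) != len(s2):
        raise ValueError("signals must have the same length")
    # Gain that brings s2 to the level implied by the target SNR.
    gain = rms(s1) / (rms(s2) * 10 ** (snr_db / 20))
    s2_scaled = [gain * x for x in s2]
    # The mixture is the sample-wise sum of the two sources.
    mixture = [a + b for a, b in zip(s1, s2_scaled)]
    return mixture, s2_scaled
```

The scaled second source is returned alongside the mixture because a separation model is trained against the sources exactly as they appear in the mixture.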
python create_scp.py
cd config/Conv_Tasnet
vim train.yml
python train_Tasnet.py --opt config/Conv_Tasnet/train.yml
python create_scp.py
cd config/Dual_RNN
vim train.yml
python train_rnn.py --opt config/Dual_RNN/train.yml
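Both training recipes start by running create_scp.py. As a rough idea of what such an index file can look like, the sketch below writes one "key path" line per .wav file, in the style of Kaldi scp files. This is an assumption for illustration: the function name `write_scp` and the exact line format are hypothetical and may differ from what create_scp.py in this repository produces.

```python
import os


def write_scp(wav_dir, scp_path):
    """Write one 'file_name file_path' line per .wav file found under wav_dir.

    Hypothetical helper; the line format is assumed, not taken from create_scp.py.
    Returns the number of entries written.
    """
    entries = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            if name.lower().endswith(".wav"):
                entries.append((name, os.path.join(root, name)))
    # Sort so the index is deterministic across runs and file systems.
    entries.sort()
    with open(scp_path, "w", encoding="utf-8") as f:
        for key, path in entries:
            f.write(f"{key} {path}\n")
    return len(entries)
```

A data loader can then read the index line by line and split each line on the first space to recover the key and the audio path.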
Before testing, edit the default parameters in test_tasnet.py, including the test files and the model to test.
python test_tasnet.py
python test_tasnet_wav.py
Before testing, edit the default parameters in test_dualrnn.py, including the test files and the model to test.
python test_dualrnn.py
python test_dualrnn_wav.py
Final results (Conv-TasNet): 15.8690, about 0.57 higher than the 15.3 reported in the paper.
Final results (Dual-path RNN): 18.98, about 0.18 higher than the 18.8 reported in the paper.
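Assuming the figures above are SI-SNR improvements in dB (the metric reported in the Conv-TasNet and Dual-path RNN papers), the sketch below shows how such a number can be computed for a single utterance. It is a plain-Python illustration, not this repository's evaluation code: the function names `si_snr` and `si_snr_improvement` and the `eps` default are hypothetical choices.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals."""
    n = len(target)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target; the remainder is treated as noise.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    proj = [scale * t for t in tgt]
    noise = [e - p for e, p in zip(est, proj)]
    proj_energy = sum(p * p for p in proj)
    noise_energy = sum(x * x for x in noise) + eps
    return 10 * math.log10(proj_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, target):
    """Gain in SI-SNR of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

A dataset-level figure is typically the mean of this per-utterance improvement over all test utterances and speakers.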