Dual Path RNN Pytorch

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch

Dual-path-RNN-Pytorch

If you have any questions, please open an issue on the repository.

If you find this project helpful, please consider giving it a star.

Demo Pages: Results of pure speech separation model

Plan

2020-02-01: Read the paper "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation". Two write-ups interpret the paper: the Zhihu article "Reading Notes: Dual-path RNN for Speech Separation" and the blog article "Reading Notes: Dual-path RNN for speech separation". Questions and discussion are welcome.
2020-02-02: Completed data preprocessing and the dataset code. Dataset code: /data_loader/Dataset.py
2020-02-03: Completed the Conv-TasNet framework (updated /model/model.py, Trainer_Tasnet.py, Train_Tasnet.py).
2020-02-07: Completed the training code (updated /model/model_rnn.py). Test parameters and some details are still being adjusted.
2020-02-08: Fixed bugs in the code.
2020-02-11: Completed the testing code.

Dataset

We use the WSJ0 dataset for the training, validation, and test sets. Below are the data download link and the code for generating the mixed audio for WSJ0.
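
The mixing step can be pictured as scaling one utterance relative to the other and then summing the two. The sketch below is not the WSJ0 mixing script: `mix_at_snr` and its arguments are hypothetical names, and samples are plain Python lists so the example needs no audio library.

```python
import math


def mix_at_snr(source_a, source_b, snr_db):
    """Mix two equal-length sample lists so that A is snr_db louder than B.

    Hypothetical helper for illustration only; not the WSJ0 mixing code.
    Returns the mixture and the rescaled copy of source_b.
    """
    if not source_a or len(source_a) != len(source_b):
        raise ValueError("sources must be non-empty and of equal length")
    power_a = sum(x * x for x in source_a) / len(source_a)
    power_b = sum(x * x for x in source_b) / len(source_b)
    if power_a == 0.0 or power_b == 0.0:
        raise ValueError("sources must not be silent")
    # Gain for B so that 10*log10(power_a / (gain**2 * power_b)) == snr_db.
    gain = math.sqrt(power_a / (power_b * 10.0 ** (snr_db / 10.0)))
    scaled_b = [gain * x for x in source_b]
    mixture = [a + b for a, b in zip(source_a, scaled_b)]
    return mixture, scaled_b
```

The rescaled second source is returned alongside the mixture because a separation model needs the individual sources, at the level they have inside the mixture, as training targets.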

Training

Training for Conv-TasNet model

First, generate the scp file with the following command. Each entry in the scp file pairs a filename with its path ("filename && path").

python create_scp.py
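
To make the scp format concrete, here is a minimal sketch of writing such a file. It is not the project's create_scp.py: `write_scp` is a hypothetical helper, and the single-space separator and sorted ordering are assumptions.

```python
import os


def write_scp(wav_dir, scp_path):
    """Write one "filename path" line per .wav file found under wav_dir.

    Hypothetical helper: the separator (a single space) and the sorted
    ordering are assumptions, not the behavior of create_scp.py.
    Returns the number of entries written.
    """
    entries = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            if name.lower().endswith(".wav"):
                entries.append((name, os.path.join(root, name)))
    entries.sort()  # deterministic order across runs
    with open(scp_path, "w", encoding="utf-8") as handle:
        for name, path in entries:
            handle.write(f"{name} {path}\n")
    return len(entries)
```

Sorting the entries keeps the mixture list and the per-speaker lists aligned when each is generated from a directory with matching filenames.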

Next, adjust the training and model parameters in "config/Conv_Tasnet/train.yml".

cd config/Conv_Tasnet
vim train.yml

Finally, run the following command from the repository root to train the model.

python train_Tasnet.py --opt config/Conv_Tasnet/train.yml
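
The training script receives the configuration path through `--opt`. A minimal sketch of how such a flag is commonly parsed with argparse follows; the function name `parse_args` and the help text are illustrative assumptions, not taken from the project's training script.

```python
import argparse


def parse_args(argv=None):
    """Parse a --opt argument holding the path to a YAML option file."""
    parser = argparse.ArgumentParser(description="Train a separation model")
    parser.add_argument(
        "--opt",
        type=str,
        required=True,
        help="path to the YAML file with training and model parameters",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Pass an explicit list so the sketch runs without command-line input.
    args = parse_args(["--opt", "config/Conv_Tasnet/train.yml"])
    print(args.opt)  # prints config/Conv_Tasnet/train.yml
```

The returned path would then typically be read with a YAML parser such as PyYAML's `yaml.safe_load`; whether the project loads it that way is an assumption here.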

Training for Dual Path RNN model

First, generate the scp file with the following command. Each entry in the scp file pairs a filename with its path ("filename && path").

python create_scp.py
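
On the consuming side, a data loader needs to turn the scp file back into a lookup from filename to path. The sketch below assumes whitespace-separated "filename path" lines; `read_scp` is a hypothetical name and the project's own loader may differ.

```python
def read_scp(scp_path):
    """Return {filename: path} from an scp file with "filename path" lines.

    Assumes whitespace separates the two fields; paths containing spaces
    survive because only the first run of whitespace is split on.
    """
    table = {}
    with open(scp_path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(f"{scp_path}:{line_number}: expected 'filename path'")
            name, path = parts
            if name in table:
                raise ValueError(f"{scp_path}:{line_number}: duplicate key {name!r}")
            table[name] = path
    return table
```

Failing loudly on malformed or duplicate lines catches a broken scp file before a long training run starts.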

Next, adjust the training and model parameters in "config/Dual_RNN/train.yml".

cd config/Dual_RNN
vim train.yml

Finally, run the following command from the repository root to train the model.

python train_rnn.py --opt config/Dual_RNN/train.yml

Inference

Conv-TasNet

Before running, edit the default parameters in test_tasnet.py, including the test files and the model to test.

For multiple audio files

python test_tasnet.py

For a single audio file

python test_tasnet_wav.py

Dual-Path-RNN

Before running, edit the default parameters in test_dualrnn.py, including the test files and the model to test.

For multiple audio files

python test_dualrnn.py

For a single audio file

python test_dualrnn_wav.py

Pretrained Models

Conv-TasNet

Conv-TasNet model

Dual-Path-RNN

Dual-Path-RNN model

Results

Conv-TasNet

Final result: 15.8690, about 0.57 higher than the 15.3 reported in the paper.

Dual-Path-RNN

Final result: 18.98, which is 0.18 higher than the 18.8 reported in the paper.
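
The README does not name the metric behind these numbers. The paper reports scale-invariant signal-to-noise ratio improvement (SI-SNRi) in dB, so the figures above are presumably SI-SNRi; that reading is an assumption. A minimal pure-Python sketch of the metric, with hypothetical function names and plain lists of samples rather than tensors:

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample lists."""
    if not target or len(estimate) != len(target):
        raise ValueError("estimate and target must be non-empty and equal length")
    n = len(target)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target to get the scaled target component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]
    e_noise = [e - s for e, s in zip(est, s_target)]
    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(x * x for x in e_noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric suits separation models whose output gain is arbitrary.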

Reference

Luo Y, Chen Z, Yoshioka T. Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation. arXiv preprint arXiv:1910.06379, 2019.
Conv-TasNet code and Dual-RNN code

Open Source Agenda is not affiliated with "Dual Path RNN Pytorch" Project. README Source: JusperLee/Dual-Path-RNN-Pytorch

Repository

JusperLee/Dual-Path-RNN-Pytorch

License

Apache-2.0
