Espnet Versions

End-to-End Speech Processing Toolkit

v.0.10.6

2 years ago

New Features

  • [New Features][ESPnet2][TTS][Installation][README] [TTS] Support python-based toolkit for xvector extractors #4016 by @Fhrozen
  • [New Features][ESPnet2] Add SpecAug2 which supports variable maximum width in time masking #3902 by @pyf98
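
One way to read "variable maximum width in time masking" is that the cap on each mask's width scales with the utterance length rather than being a fixed number of frames. The sketch below illustrates that idea in plain Python. It is not the code from #3902; the function name `mask_along_time` and its parameters (`max_width_ratio`, `num_masks`, `fill`) are hypothetical.

```python
import random


def mask_along_time(frames, max_width_ratio=0.05, num_masks=2, fill=0.0, rng=None):
    """Return a copy of `frames` with random time spans overwritten by `fill`.

    `frames` is a list of per-frame feature vectors (lists of floats).
    The widest allowed mask is max_width_ratio * len(frames), so longer
    utterances can receive proportionally wider masks.
    Hypothetical helper for illustration only, not ESPnet's API.
    """
    rng = rng or random.Random()
    num_frames = len(frames)
    # The cap depends on utterance length; clamp it to the valid range.
    max_width = min(num_frames, int(num_frames * max_width_ratio))
    masked = [list(frame) for frame in frames]  # leave the input untouched
    for _ in range(num_masks):
        width = rng.randint(0, max_width)           # inclusive bounds
        start = rng.randint(0, num_frames - width)  # mask stays in range
        for t in range(start, start + width):
            masked[t] = [fill] * len(masked[t])
    return masked


if __name__ == "__main__":
    utterance = [[1.0, 1.0, 1.0] for _ in range(200)]
    augmented = mask_along_time(utterance, max_width_ratio=0.1, rng=random.Random(0))
    print(sum(1 for frame in augmented if frame == [0.0, 0.0, 0.0]), "frames masked")
```

With a fixed frame cap, a 2-second and a 20-second utterance would lose the same absolute amount of context; a ratio-based cap keeps the masked fraction comparable across lengths.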

Recipe

  • [Recipe][ESPnet1][ASR] Add librispeech-100h recipe #3997 by @YosukeHiguchi
  • [Recipe][ESPnet1][ASR] Update egs/librispeech_100 #4036 by @YosukeHiguchi
  • [Recipe][ESPnet2][ASR][README] Scoring Mandarin / English separately for the SEAME corpus #3976 by @vectominist
  • [Recipe][ESPnet2][ASR][README] update LibriSpeech Pretrained models with SSLRs: results and huggingf… #3979 by @simpleoier
  • [Recipe][ESPnet2][ASR][README][ST] Speech translation framework (merging into master) #3987 by @ftshijt
  • [Recipe][ESPnet2][ASR][TTS] Update two recipes (googlei18n and hub4_spanish) #3895 by @ftshijt
  • [Recipe][ESPnet2][SLU][README] updated the results of Slue voxceleb #3929 by @siddhu001
  • [Recipe][ESPnet2][ST] Update the default setting for st #3993 by @ftshijt

Bugfix

  • [Bugfix][ESPnet1][RNNT] Fix bug for Conformer-T #4020 by @YosukeHiguchi
  • [Bugfix][ESPnet2][Diarization] Diarization: fix for convolutional input layer in the encoder #3957 by @alumae
  • [Bugfix][ESPnet2][Diarization] Two fixes to diarization evaluation scripts #3938 by @alumae
  • [Bugfix][ESPnet2][Diarization][Recipe] Fix issues in EEND-EDA & add Librimix_diar recipe #3900 by @YushiUeda
  • [Bugfix][ESPnet2][ESPnet1][ASR][streaming] streaming conformer bugfix #4025 by @jeon30c
  • [Bugfix][ESPnet2][LM] Bugfix for espnet2 ngram #4002 by @yaochie
  • [Bugfix][ESPnet2][RNNT] espnet2 asr inference bugfix for transducer #3943 by @jeon30c
  • [Bugfix][ESPnet2][ST] Bugfix for ST scoring #3972 by @ftshijt

Enhancement

  • [Enhancement][ESPnet2] cleaned tensorboard and stats logging for espnet2 #3910 by @siddalmia
  • [Enhancement][ESPnet2][Diarization] Add test codes for diarization #3953 by @YushiUeda
  • [Enhancement][ESPnet2][streaming] Add reference for streaming ASR #4014 by @D-Keqi

Others

  • [CI] remove the support of pytorch 1.3.1 #4038 by @sw005320
  • [CI][ESPnet1][ESPnet2] fix ci for librosa update #4043 by @ftshijt
  • [CI][Installation] Fix numpy version #3965 by @kan-bayashi
  • [CI][Installation] temporary fixed pypinyin version #3995 by @kan-bayashi
  • [Documentation][ESPnet1][ESPnet2][README][SLU] Add Sinhala E2E SLU Recipe #3890 by @karthik19967829
  • [Documentation][README] Update README.md #4039 by @sw005320
  • [ESPnet2][README] Update README.md #3931 by @sw005320
  • [ESPnet2][README][TTS][Typo] Fix typo in README.md #4024 by @kan-bayashi

Acknowledgements

Special thanks to @D-Keqi, @Fhrozen, @YosukeHiguchi, @YushiUeda, @alumae, @ftshijt, @jeon30c, @kan-bayashi, @karthik19967829, @pyf98, @siddalmia, @siddhu001, @simpleoier, @sw005320, @vectominist, @yaochie.

Full Changelog

https://github.com/espnet/espnet/compare/v.0.10.5...v.0.10.6

v.0.10.5

2 years ago

New Features

  • [New Features][ESPnet1][ASR] Implement self-conditioned CTC #3856 by @komatta-san
  • [New Features][ESPnet2][ASR][CI][Installation] GTN CTC for ESPnet2 #3778 by @brianyan918
  • [New Features][ESPnet2][ASR][Refactoring] [ESPnet2] Transducer #2533 by @b-flo
  • [New Features][ESPnet2][README][Recipe] Frontends fusion (any type, any number, linear fusion only for now) for ASR in espnet2 #3824 by @DanBerrebbi
  • [New Features][ESPnet2][SE] Refactor loss computation in enhancement tasks. #3838 by @LiChenda

Recipe

  • [Recipe][ESPnet1][ESPnet2][ASR][README] updated the results of aidatatang_200zh #3925 by @sw005320
  • [Recipe][ESPnet1][VC] Various fixes of voice conversion recipes #3800 by @unilight
  • [Recipe][ESPnet2][ASR][README] Expanding egs2 of Tedlium2 #3795 by @D-Keqi
  • [Recipe][ESPnet2][ASR][README] Update an4 config #3913 by @pyf98
  • [Recipe][ESPnet2][ASR][README] aidatatang_200zh recipe #3892 by @sw005320
  • [Recipe][ESPnet2][README] Update README.md #3881 by @daisylab
  • [Recipe][ESPnet2][README] Update egs2/TEMPLATE/README.md #3793 by @kamo-naoyuki
  • [Recipe][ESPnet2][README] fix readme #3827 by @seastar105
  • [Recipe][ESPnet2][README][Recipe] Add ASR Recipe: Primewords_Chinese #3903 by @pyf98
  • [Recipe][ESPnet2][README][Recipe] Update MISP challenge ASR baseline and add AVSR baseline #3819 by @neillu23
  • [Recipe][ESPnet2][README][SLU] Fsc Maseeval scripts #3769 by @siddhu001
  • [Recipe][ESPnet2][README][SLU] Update Google Speechcommands (SLU recipe) #3915 by @pyf98
  • [Recipe][ESPnet2][README][TTS] ESPnet2 ARCTIC TTS #3791 by @peter-yh-wu
  • [Recipe][ESPnet2][README][TTS] Update README and add missing config #3917 by @kan-bayashi
  • [Recipe][ESPnet2][Recipe][SLU] Slue voxceleb Sentiment Analysis #3894 by @siddhu001
  • [Recipe][ESPnet2][SE] modified data type in enh.sh #3768 by @simpleoier

Bugfix

  • [Bugfix][ESPnet1][README][RNNT] Fix cache for Transducer search strategies + doc #3869 by @b-flo
  • [Bugfix][ESPnet1][RNNT] Fix recombine_hyps #3908 by @b-flo
  • [Bugfix][ESPnet1][RNNT] fix rnn-t ALSD beam search index bug #3794 by @maxwellzh
  • [Bugfix][ESPnet1][RNNT] fix the sort order in select_k_expansions() #3864 by @freewym
  • [Bugfix][ESPnet2] Bug fix for .gitignore and db fill up for CMU cluster #3891 by @siddalmia
  • [Bugfix][ESPnet2] Fix #3716 #3849 by @kan-bayashi
  • [Bugfix][ESPnet2] Merging asr_streaming.sh into asr.sh for laborotv egs2 #3868 by @D-Keqi
  • [Bugfix][ESPnet2] add init.py #3928 by @sw005320
  • [Bugfix][ESPnet2] fix small problem that used before defined in step 12 #3871 by @simpleoier
  • [Bugfix][ESPnet2] fix stft olens when win_lengths is not equal to n_fft #3812 by @IceCreamWW
  • [Bugfix][ESPnet2] update s3prl frontend w.r.t. recent modification in s3prl interface #3839 by @simpleoier
  • [Bugfix][ESPnet2][TTS] bugfix lang2lid in tts.sh #3906 by @imdanboy
  • [Bugfix][Installation] Fix #3783 #3786 by @kamo-naoyuki

Others

  • [CI] Fix G2P test failure in CI due to the dict update #3848 by @kan-bayashi
  • [CI][Documentation][ESPnet1][ESPnet2] Fixing issues about streaming Transformer/Conformer training #3880 by @D-Keqi
  • [CI][ESPnet1][ESPnet2][Installation][New Features][README] nbest rescoring with k2 #3567 by @glynpu
  • [Documentation][README] Update README.md #3893 by @sw005320
  • [Documentation][README][SSL] Add more docs about s3prl frontend #3796 by @simpleoier
  • [Documentation][README][streaming] Updating main README.md about streaming transformer #3855 by @D-Keqi
  • [ESPnet1][RNNT] Add exception for conformer decoder #3801 by @b-flo
  • [ESPnet2][README][Typo] Fix typo in README.md #3852 by @kan-bayashi
  • [ESPnet2][SE] add eps in beam-forming reference channel selection #3904 by @LiChenda
  • [ESPnet2][SLU] Add unit test for score_intent.py #3759 by @siddhu001
  • [ESPnet2][ST] Speech Translation Update #3860 by @ftshijt
  • [ESPnet2][TTS][Installation][Refactoring] Refactor Phonemizer-based G2P #3916 by @kan-bayashi

Acknowledgements

Special thanks to @D-Keqi, @DanBerrebbi, @IceCreamWW, @LiChenda, @b-flo, @brianyan918, @daisylab, @freewym, @ftshijt, @glynpu, @imdanboy, @kamo-naoyuki, @kan-bayashi, @komatta-san, @maxwellzh, @neillu23, @peter-yh-wu, @pyf98, @seastar105, @siddalmia, @siddhu001, @simpleoier, @sw005320, @unilight.

v.0.10.4

2 years ago

New Features

  • [New Features][ESPnet1][ESPnet2][ASR][README] The code for Emiru's real streaming Transformer #3614 by @D-Keqi
  • [New Features][ESPnet1][MT][ST][Installation] Support sacreBLEU #3698 by @hirofumi0810
  • [New Features][ESPnet2][ST] ESPNet2 speech translation #3587 by @ftshijt

Enhancement

  • [Enhancement][ESPnet1][ASR] Fix e2e_asr_maskctc.py to make RTF computable #3634 by @eddiewng
  • [Enhancement][ESPnet2][Installation][README] HuggingFace Upload support for ESPnet2 tasks [cont.] #3677 by @Fhrozen
  • [Enhancement][ESPnet2][TTS][Installation] Add korean_jaso tokenizer and korean_cleaner #3588 by @windtoker

Bugfix

  • [Bugfix][ESPnet1][ASR][RNNT] Fix quantization for Transducer #3616 by @b-flo
  • [Bugfix][ESPnet2][ASR][Recipe] added download test set, small modifications for path of aishell #3663 by @teinhonglo
  • [Bugfix][ESPnet2] Do stft with librosa when neither MKL nor CUDA is available. #3668 by @CTinRay
  • [Bugfix][ESPnet2] [bug fixed] allow adding noise independently of rir, bug fixed in #3692 by @ranchlai
  • [Bugfix][ESPnet2][Recipe] Create Symlinks for 1-channel/2-channel tracks in chime4 #3699 by @neillu23
  • [Bugfix][ESPnet2][Recipe] Fix SWBD Data Prep Bug #3742 by @brianyan918

Recipe

  • [Recipe][ESPnet1][ASR][MT][ST] Add CoVoST2 recipe #3720 by @hirofumi0810
  • [Recipe][ESPnet2][ASR][README] MISP2021 E2E ASR Baseline #3738 by @neillu23
  • [Recipe][ESPnet2][ASR][README] Wenetspeech #3686 by @pengchengguo
  • [Recipe][ESPnet2][SLU] Add snips hubert feature training #3619 by @yuekaizhang
  • [Recipe][ESPnet2][SLU] Make scoring part more general #3715 by @siddhu001
  • [Recipe][ESPnet2][SLU][README] Add ESPnet-SLU Recipe: Google Speech Commands #3693 by @pyf98
  • [Recipe][ESPnet2][SLU][README] Add an ESPnet2 recipe for the Grabo SLU dataset #3669 by @pyf98
  • [Recipe][ESPnet2][SLU][README] CATSLU-MAPS: Added recipe #3685 by @SujaySKumar
  • [Recipe][ESPnet2][SLU][README] ESPnet2 Japanese dialogue act classification recipe #3667 by @YushiUeda
  • [Recipe][ESPnet2][SLU][README] Slurp SLU with bpe encoded transcripts #3674 by @siddhu001
  • [Recipe][ESPnet2][SLU][README] Slurp entity classification #3739 by @siddhu001
  • [Recipe][ESPnet2][SSL] Add eps in acc computation of HuBERT model #3713 by @simpleoier
  • [Recipe][ESPnet2][TTS] Change the timing of srctexts creation #3734 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] update kss recipe with VITS configuration #3660 by @windtoker

Others

  • [CI][ESPnet2][Installation] Fix tests in CI #3700 by @kan-bayashi
  • [CI][ESPnet2][SLU][README] Add Hubert pretrained ASR in FSC SLU #3653 by @siddhu001
  • [CI][Installation] Minor update for CI #3656 by @kan-bayashi
  • [Documentation][ESPnet1][README][RNNT][Refactoring] Refactor custom Transducer build #3697 by @b-flo
  • [Documentation][ESPnet2][README] Hugging Face support - Doc [cont.] #3709 by @Fhrozen
  • [Installation] Update pyopenjtalk version #3733 by @kan-bayashi
  • [README] Huggingface spaces ESPnet2-TTS web demo #3673 by @AK391
  • [README][ESPnet2] Add Huggingface model documentation #3714 by @siddhu001
  • [README][ESPnet2] Fix readme #3750 by @takenori-y

Acknowledgements

Special thanks to @AK391, @CTinRay, @D-Keqi, @Fhrozen, @SujaySKumar, @YushiUeda, @b-flo, @brianyan918, @eddiewng, @ftshijt, @hirofumi0810, @kan-bayashi, @neillu23, @pengchengguo, @pyf98, @ranchlai, @siddhu001, @simpleoier, @takenori-y, @teinhonglo, @windtoker, @yuekaizhang.

v.0.10.3

2 years ago

New Features

  • [New Features][ESPnet1][RNNT][Installation][README] FastEmit support #3591 by @b-flo
  • [New Features][ESPnet2][ASR] Add ASR portable evaluation script #3569 by @kan-bayashi
  • [New Features][ESPnet2][README] EEND-EDA model for diarization task #3621 by @YushiUeda

Bugfix

  • [Bugfix][ESPnet1] Fix /usr/bin/env bash -e #3651 by @kamo-naoyuki
  • [Bugfix][ESPnet1] ctc loss using dropout layer since .eval() will not work for F.dropout #3539 by @zh794390558
  • [Bugfix][ESPnet2] Minor fix of evaluate_asr.sh #3596 by @kan-bayashi
  • [Bugfix][ESPnet2][ASR] wav2vec2_encoder bug fix #3545 by @simpleoier
  • [Bugfix][ESPnet2][README][SSL] Fix some issues of #3512 and add README.md to librispeech/ssl1 recipe. #3572 by @Jzmo
  • [Bugfix][ESPnet2][TTS] Bug fix the attribute registration in VITS generator #3573 by @kan-bayashi
  • [Bugfix][ESPnet2][TTS] Fix pyopenjtalk_g2p_accent(_with_pause) #3555 by @zzxiang

Recipe

  • [Recipe][ESPnet1][ASR][RNNT] Update Transducer recipes #3465 by @b-flo
  • [Recipe][ESPnet1][ST] Clean libri-trans #3540 by @hirofumi0810
  • [Recipe][ESPnet2][ASR][README] Dan aishell4 branch #3585 by @DanBerrebbi
  • [Recipe][ESPnet2][ASR][README] update pretrained models of librispeech using hubert/wav2vec2 #3568 by @simpleoier
  • [Recipe][ESPnet2][SLU][README] Add slu snips data receipe #3407 by @yuekaizhang
  • [Recipe][ESPnet2][TTS] Update GAN-TTS based configurations #3570 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Add initial VITS results for JSUT #3550 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Add つくよみちゃんコーパス recipe #3552 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] IndicSpeech TTS Scripts #3435 by @peter-yh-wu
  • [Recipe][ESPnet2][TTS][README] Update ESPnet2-TTS results #3578 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Update JSUT and JVS results #3553 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Update LJSpeech and CSMSC results #3560 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Update TTS results #3615 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Update TTS results #3648 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Update VCTK results #3581 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Update pret-trained model for TTS recipes #3590 by @ftshijt
  • [Recipe][ESPnet2][TTS][README] update kss recipe with new result. #3589 by @windtoker
  • [Recipe][ESPnet2][TTS][Typo] Fix typo egs2/jtubespeech/tts1 #3564 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][Typo] Update JVS README #3554 by @kan-bayashi

Enhancement

  • [Enhancement][ESPnet2][SE][Refactoring] Add PyTorch Builtin Complex Support in the Speech Enhancement Task #3355 by @Emrys365
  • [Enhancement][ESPnet2][TTS] Hindi g2p #3579 by @peter-yh-wu
  • [Enhancement][ESPnet2][TTS] Unify spks / lids / spk_embed_dim type #3551 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Update evaluate_mcd.py script #3566 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS][Installation] Add the installer of tdmelodic pyopenjtalk #3561 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS][Installation][README] Update TTS objective eval scripts #3650 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS][README] Add a new Japanese G2P for TTS #3558 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS][README] Add a new english G2P #3597 by @kan-bayashi

Others

  • [CI] Add codecov config and flags. #3603 by @ShigekiKarita
  • [CI] Omit tools/ from code coverage. #3600 by @ShigekiKarita
  • [CI] Split test_integration.sh #3599 by @ShigekiKarita
  • [CI][ESPnet2][Installation][Refactoring] Make the installation of transformers optional #3622 by @kan-bayashi
  • [CI][Installation] Add no-check-certificate option in PESQ installation #3649 by @kan-bayashi
  • [CI][Installation][README][mergify] Change setup.py for pytorch1.9.1 #3636 by @kamo-naoyuki
  • [Documentation][ESPnet1][RNNT] Fix/improve doc(string)s related to Transducer model #3623 by @b-flo
  • [Documentation][ESPnet2][TTS][README] Update README of ESPnet2-TTS #3546 by @kan-bayashi
  • [Documentation][ESPnet2][TTS][README] Update TTS README #3565 by @kan-bayashi
  • [Documentation][ESPnet2][TTS][README] Update TTS fine-tuning README #3549 by @kan-bayashi
  • [Typo][ESPnet2] Minor bug in format_wav_scp.py #3575 by @ftshijt
  • [Typo][ESPnet2][TTS] update mismatch help info for tts #3602 by @ftshijt

Acknowledgements

Special thanks to @DanBerrebbi, @Emrys365, @Jzmo, @ShigekiKarita, @YushiUeda, @b-flo, @ftshijt, @hirofumi0810, @kamo-naoyuki, @kan-bayashi, @peter-yh-wu, @simpleoier, @windtoker, @yuekaizhang, @zh794390558, @zzxiang.

v.0.10.2

2 years ago

News

  • Hubert training is now available!
    • Try with egs2/librispeech/ssl1
  • GAN-based TTS model is now available!
    • Joint text2mel and vocoder training
    • End-to-end text-to-wave model (VITS) training
    • Try with egs2/ljspeech/tts1
  • The from_pretrained function is now supported!
    # e.g., replace "model_tag" with the tag of a model listed in espnet_model_zoo
    from espnet2.bin.asr_inference import Speech2Text
    asr = Speech2Text.from_pretrained("model_tag")
    
    from espnet2.bin.tts_inference import Text2Speech
    tts = Text2Speech.from_pretrained("model_tag")
    
    from espnet2.bin.enh_inference import SeparateSpeech
    enh = SeparateSpeech.from_pretrained("model_tag")
    
    from espnet2.bin.diar_inference import DiarizeSpeech
    diar = DiarizeSpeech.from_pretrained("model_tag")
    
    Please check the available pretrained models in espnet_model_zoo!

New Features

  • [New Features][ESPnet1] Intermediate CTC + Stochastic depth #3274 by @jaesong
  • [New Features][ESPnet2] Add new trainer for GAN-based training #3436 by @kan-bayashi
  • [New Features][ESPnet2][ASR] Add Hubert model in Espnet2/Refactor from #3458 #3512 by @Jzmo
  • [New Features][ESPnet2][ASR] batch decode with k2 ctc #3433 by @glynpu
  • [New Features][ESPnet2][ASR][SE] Support from_pretrained for ASR and ENH #3535 by @kan-bayashi
  • [New Features][ESPnet2][DIAR] Support from_pretrained for DIAR #3537 by @YushiUeda
  • [New Features][ESPnet2][SE] Adding portable speech enhancement scripts for other tasks #3487 by @Emrys365
  • [New Features][ESPnet2][TTS] Add GAN-TTS task with VITS #3449 by @kan-bayashi
  • [New Features][ESPnet2][TTS] Support SID and LID inputs for TTS models #3490 by @kan-bayashi
  • [New Features][ESPnet2][TTS] Support from_pretrained function in Text2Speech #3532 by @kan-bayashi
  • [New Features][ESPnet2][TTS] Support parallel_wavegan vocoders in tts_inference.py #3513 by @kan-bayashi
  • [New Features][ESPnet2][TTS] Support joint training of text2mel and vocoder #3501 by @kan-bayashi
  • [New Features][ESPnet2][TTS] Support language ID input for espnet2 TTS #3489 by @kan-bayashi
  • [New Features][ESPnet2][TTS] Support speaker id input for TTS models #3452 by @kan-bayashi

Enhancement

  • [Enhancement][ESPnet2][CTC segmentation][README] Fix CTC Segmentation #3500 by @shirayu
  • [Enhancement][ESPnet2][TTS] Add VITS-related modules #3448 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Add cython code for VITS #3483 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Add joint training config example #3508 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Add melgan module for joint training #3516 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Add parallel wavegan module for joint training #3515 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Add style melgan module for joint training #3517 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Add vocoder modules related to VITS #3439 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Change Text2Speech class output format #3437 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Follow up of the support speaker id input #3453 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support cleaner option in phn converter util #3450 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support language id in VITS #3499 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support linear spectrogram #3438 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support new g2p functions for various languages #3463 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Update the TTS inference #3498 by @kan-bayashi
  • [Enhancement][ESPnet2][SLU][README] Add support for intent classification on SLURP dataset #3482 by @siddhu001
  • [Enhancement][ESPnet2][SLU][README] Add NLU post-encoder using Hugging Face Transformers #3410 by @akreal

Recipe

  • [Recipe][ESPnet1][ASR] Mucs21 subtask1 #3376 by @sanket0211
  • [Recipe][ESPnet2][ASR][README] Add Swahili ASR recipe #3485 by @akreal
  • [Recipe][ESPnet2][ASR][README] Rename swahili recipe to iwslt21_low_resource #3522 by @akreal
  • [Recipe][ESPnet2][DIAR][README] Modify ESPnet2 diarization recipe #3524 by @YushiUeda
  • [Recipe][ESPnet2][ESPnet1][ASR] Espnet2 mucs_subtask2 #3415 by @bloodraven66
  • [Recipe][ESPnet2][ESPnet1][ASR] mucs subtask1 #3417 by @bloodraven66
  • [Recipe][ESPnet2][SE] Add Voicebank (vctk_noisy) script #3486 by @neillu23
  • [Recipe][ESPnet2][TTS] Add missing configs for LibriTTS recipe #3455 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Update VITS config comments and settings #3528 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] aishell3 dataset preparation #3505 by @actboy
  • [Recipe][ESPnet2][TTS][README] Add CSS10 recipe for ESPnet2-TTS #3464 by @kan-bayashi
  • [Recipe][ESPnet2][TTS][README] Add JtubeSpeech Recipe #3459 by @Takaaki-Saeki
  • [Recipe][ESPnet2][TTS][README] Add SIWIS recipe #3460 by @takenori-y
  • [Recipe][ESPnet2][TTS][README] TTS recipe for J-KAC corpus #3468 by @TanUkkii007
  • [Recipe][ESPnet2][TTS][README] TTS recipes for thchs30 and aishell3 #3470 by @ftshijt
  • [Recipe][ESPnet2][TTS][README] Update JMD README #3531 by @takenori-y
  • [Recipe][ESPnet2][TTS][README] Update SIWIS README #3509 by @takenori-y
  • [Recipe][ESPnet2][SLU][README] Predict ASR transcript along with Intent for SLU #3480 by @siddhu001
  • [Recipe][ESPnet2][SLU][README] Update SWBD DA configuration #3425 by @akreal

Bugfix

  • [Bugfix][ESPnet2] Add return_complex=False for stft #3476 by @D-X-Y
  • [Bugfix][ESPnet2] Dynamic import for the ngram function #3420 by @ftshijt
  • [Bugfix][ESPnet2][README][Recipe] Add the GigaSpeech normalization and fix the WER #3519 by @chaisz19
  • [Bugfix][ESPnet2][TTS] Add duration and focus_rate in output dict #3469 by @kan-bayashi
  • [Bugfix][ESPnet2][TTS] Add missing symlink to trim_silence.py for ESPnet2 #3467 by @kan-bayashi
  • [Bugfix][ESPnet2][TTS] Fix wrong arguments in pretrained vococder wrapper #3525 by @kan-bayashi
  • [Bugfix][ESPnet2][TTS] Revert wrongly removed lines in tts.sh #3503 by @kan-bayashi
  • [Bugfix][ESPnet2][TTS][Typo] Fix typo in hifigan #3504 by @kan-bayashi

Refactoring

  • [Refactoring][ESPnet1][ASR][RNNT][README] Transducer v5 #3217 by @b-flo
  • [Refactoring][ESPnet2][SE][DIAR] Remove prefix enh_ and diar_ #3538 by @kan-bayashi
  • [Refactoring][ESPnet2][TTS] Refactor TTS modules in ESPnet2 #3497 by @kan-bayashi
  • [Refactoring][ESPnet2][TTS] Remove the support of feats_type=fbank/stft in ESPnet2-TTS #3514 by @kan-bayashi

Others

  • [CI] Fix k2 version in CI using conda #3493 by @kan-bayashi
  • [CI] Fix test condition #3527 by @kan-bayashi
  • [CI][Installation] Update Sentencepiece and add python 3.9 to CI #3422 by @shirayu
  • [Docker] Docker Updates #3393 by @Fhrozen
  • [Documentation] Update the tutorial about maxlenratio usage #3523 by @akreal
  • [Documentation][ESPnet2][TTS] Update README.md #3502 by @kan-bayashi
  • [Installation][README] Added a link and a classifier for Python 3.9 #3440 by @shirayu
  • [Typo] Fix typos in "egs" #3447 by @shirayu
  • [Typo][Documentation] Fix typos in "doc" #3441 by @shirayu
  • [Typo][Documentation] Fix typos in "utils" #3442 by @shirayu
  • [Typo][ESPnet1][MT] Fix typos in "espnet" #3444 by @shirayu
  • [Typo][ESPnet2] Fix typos in "espnet2" #3443 by @shirayu
  • [Typo][ESPnet2][README] Fix typos in "egs2" #3445 by @shirayu

Acknowledgements

Special thanks to @D-X-Y, @Emrys365, @Fhrozen, @Jzmo, @Takaaki-Saeki, @TanUkkii007, @YushiUeda, @actboy, @akreal, @b-flo, @bloodraven66, @chaisz19, @ftshijt, @glynpu, @jaesong, @kan-bayashi, @neillu23, @sanket0211, @shirayu, @siddhu001, @takenori-y.

v.0.10.1

2 years ago

New Features

  • [New Features][ESPnet2] Porting existing pre-trained models to hugging face #3321 by @siddhu001
  • [New Features][ESPnet2][ASR][CI][Installation] k2_and_espnet2 #3358 by @glynpu
  • [New Features][ESPnet2][ASR][LM][CI] espnet2 ngram #3345 by @qmpzzpmq
  • [New Features][ESPnet2][Installation] add s3prl frontend #3187 by @simpleoier

Recipe

  • [Recipe][ESPnet1][ASR] Fix the iconv error in hkust data prep #3397 by @sw005320
  • [Recipe][ESPnet1][ASR] mucs subtask2 baseline recipes (e2e and kaldi) #3362 by @bloodraven66
  • [Recipe][ESPnet1][ESPnet2][ASR] JTubeSpeech recipe and hkust espnet1 #3406 by @sw005320
  • [Recipe][ESPnet1][TTS] CMU INDIC TTS #3347 by @peter-yh-wu
  • [Recipe][ESPnet2][ASR] ESPnet2 Recipe for Ksponspeech #3387 by @YushiUeda
  • [Recipe][ESPnet2][ASR] Fix gigaspeech pre-trained model link #3317 by @sw005320
  • [Recipe][ESPnet2][ASR] LRS2 lipreading recipe #3346 by @LiChenda
  • [Recipe][ESPnet2][ASR] OpenSLR Sundanese ASR #3344 by @peter-yh-wu
  • [Recipe][ESPnet2][ASR] Recipe of JTubeSpeech #3311 by @sw005320
  • [Recipe][ESPnet2][ASR] fix path error in local/score.sh in swbd #3349 by @wonkyuml
  • [Recipe][ESPnet2][ASR] updated javanese and sundanese readmes #3369 by @peter-yh-wu
  • [Recipe][ESPnet2][ASR][Installation] OpenSLR Javanese ASR #2960 by @peter-yh-wu
  • [Recipe][ESPnet2][SLU] Add initial Switchboard Dialogue Act classification recipe #3395 by @akreal
  • [Recipe][ESPnet2][SLU] FSC Espnet2 data preparation #3352 by @siddhu001
  • [Recipe][ESPnet2][TTS] Add HUI-audio-corpus-german recipe for ESPnet2-TTS #3375 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Add JMD recipe #3394 by @takenori-y
  • [Recipe][ESPnet2][TTS] Add RUSLAN recipe for ESPnet2-TTS #3378 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Support KSS dataset recipe for ESPnet2-TTS #3383 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Update HUI audio corpus german recipe #3381 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Update HUI-audio-corpus-german recipe results of ESPnet2-TTS #3391 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Update KSS dataset recipe results of ESPnet2-TTS #3400 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] Update RUSLAN recipe results of ESPnet2-TTS #3390 by @kan-bayashi
  • [Recipe][ESPnet2][TTS] indic tts without pretrained model #3401 by @peter-yh-wu

Enhancement

  • [Enhancement][ESPnet2] Update wav2vec2_encoder.py #3312 by @brotheroak
  • [Enhancement][ESPnet2][TTS] Add trim_silence for ESPnet2-TTS #3380 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Allow override default 'speed_control_alpha' parameter #3316 by @airenas
  • [Enhancement][ESPnet2][TTS] Support French G2P #3372 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support German G2P #3371 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support Korean G2P #3382 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support Russian G2P #3377 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Support Spanish G2P #3373 by @kan-bayashi
  • [Enhancement][ESPnet2][TTS] Update README about G2P #3374 by @kan-bayashi

Bugfix

  • [Bugfix][ESPnet1][ESPnet2] Fix a type error of swbd data preparation. #3324 by @pengchengguo
  • [Bugfix][ESPnet1][ESPnet2][TTS] Fixed label modification in Taco2 or Transformer-TTS with R > 1 #3392 by @kan-bayashi
  • [Bugfix][ESPnet2] fix a bug in OneCycleLR and CyclicLR #3319 by @sw005320

Others

  • [Typo][ESPnet1] Update batch_beam_search_online_sim.py #3367 by @aky15
  • [Typo][ESPnet2] Fixed typo in model name #3364 by @kan-bayashi
  • [Typo][ESPnet2] Update contextual_block_transformer_encoder.py #3354 by @aky15

Acknowledgements

Special thanks to @LiChenda, @YushiUeda, @airenas, @akreal, @aky15, @bloodraven66, @brotheroak, @glynpu, @kan-bayashi, @pengchengguo, @peter-yh-wu, @qmpzzpmq, @siddhu001, @simpleoier, @sw005320, @takenori-y, @wonkyuml.

v.0.10.0

2 years ago

From v.0.10.x, we drop support for pytorch < 1.3.
See more info at https://github.com/espnet/espnet/issues/3300
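
To check whether an environment meets this minimum before upgrading, a version comparison along the following lines may help. This is a minimal sketch that is not part of ESPnet; the helper names `parse_version` and `meets_minimum` are hypothetical, and the parsing only handles simple version strings such as "1.9.0+cu111".

```python
def parse_version(text):
    """Convert a version string such as '1.9.0+cu111' into a tuple like (1, 9, 0).

    Anything after '+' is ignored, and each dot-separated piece is read
    up to its first non-digit character.
    """
    core = text.split("+")[0]
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="1.3"):
    """Return True if `installed` is at least `minimum` (numeric comparison)."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.3' and '1.3.0' compare equal.
    size = max(len(a), len(b))
    a += (0,) * (size - len(a))
    b += (0,) * (size - len(b))
    return a >= b


if __name__ == "__main__":
    for candidate in ("1.2.0", "1.3.0", "1.9.0+cu111"):
        print(candidate, meets_minimum(candidate))
```

When PyTorch is installed, `torch.__version__` can be passed as the `installed` argument.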

New Features and Enhancement

  • [New Features][ESPnet1][ASR][CI] Dynamic quantization for decoding #3210 by @xu-gaopeng
  • [New Features][ESPnet1] Add quantize args #3280 by @xu-gaopeng
  • [Enhancement][ESPnet2][README] Update W&B integration #3278 by @AyushExel
  • [Enhancement][ESPnet2][README] Change the default value of use_wandb to False #3287 by @kamo-naoyuki

Bugfix

  • [Bugfix][ESPnet1] Fix some bugs in xml2stm.py #3252 by @AshrafMahdhi
  • [Bugfix][ESPnet1][Recipe] fix the required number of arguments #3249 by @AshrafMahdhi
  • [Bugfix][ESPnet2] Bug fix of accum_grad when grad-nan #3283 by @kamo-naoyuki
  • [Bugfix][ESPnet2] Fix #3255 #3257 by @tjysdsg
  • [Bugfix][ESPnet2] Fix bug when "--field -5" is passed to espnet2.bin.tokenize_text #3262 by @tjysdsg
  • [Bugfix][ESPnet2] Fix typo in asr.sh (espnet2) that might cause bug #3264 by @tjysdsg
  • [Bugfix][ESPnet2] Warn ignore_nan_grad with warpctc instead of error. #3298 by @ShigekiKarita
  • [Bugfix][ESPnet2][TTS] Fix a bug in the TTS transformer initialization #3251 by @sw005320

Recipe

  • [Recipe][ESPnet1][ST] Minor fix of Fisher-Callhome recipe #3305 by @hirofumi0810
  • [Recipe][ESPnet2][ASR] ESPnet2 Receipe for swbd #3269 by @yuekaizhang
  • [Recipe][ESPnet2][ASR][README] SWBD Result Update #3308 by @roshansh-cmu
  • [Recipe][ESPnet2][SE] Add scripts for DNS Interspeech 2020 in ESPNet-se #3259 by @neillu23
  • [Recipe][ESPnet2][SE][README] Pretrained model for vctk noisy reverberant recipe #3273 by @LiChenda
  • [Recipe][ESPnet2][SE][README] dns_ins20: Add README.md and real_recording testing data. #3281 by @neillu23

Refactoring

  • [Refactoring][ESPnet2][ASR] Update ctc.py #3292 by @200987299
  • [Refactoring][ESPnet1][ASR][MT][CI][README] Delete old pytorch dispatch in espnet1 #3301 by @ShigekiKarita
  • [Refactoring][CI][Documentation][Installation][README] Remove travis and add .github/workflows/doc.yml to deploy doc #3294 by @ShigekiKarita
  • [Refactoring][CI][Installation][README] Add pytorch 1.9.0 support and remove 1.0.1, 1.1.0, and 1.2.0 #3299 by @ShigekiKarita

Others

  • [Documentation][ESPnet2] Add a comment for disabling the attention plot #3258 by @sw005320
  • [ESPnet2][Installation][mergify] Follow up for #3299, about pytorch1.9.0 in ci #3310 by @kamo-naoyuki

Acknowledgements

Special thanks to @200987299, @AshrafMahdhi, @AyushExel, @LiChenda, @ShigekiKarita, @hirofumi0810, @kamo-naoyuki, @neillu23, @roshansh-cmu, @sw005320, @tjysdsg, @xu-gaopeng, @yuekaizhang.

v.0.9.10

2 years ago

New Features

  • [New Features][ESPnet1][ESPnet2][Installation][README] CTC Segmentation for ESPnet 2 #3087 by @lumaku

Bugfix

  • [Bugfix][ESPnet1] Fix merge_short_segments.py #3171 by @hirofumi0810
  • [Bugfix][ESPnet1] update layer norm to reflect the dimension variable #3193 by @sw005320
  • [Bugfix][ESPnet1][ASR] Fix a bug about variable spelling errors #3208 by @lzm0706
  • [Bugfix][ESPnet1][ST] Fix ST-TED data preparation #3167 by @hirofumi0810
  • [Bugfix][ESPnet2] Fix a bug of adding noise to the training data. #3220 by @pengchengguo
  • [Bugfix][ESPnet2] fix a bug in the CTC mode #3190 by @sw005320
  • [Bugfix][ESPnet2] fix typo for AdapterForSoundScpReader #3096 by @deciding
  • [Bugfix][ESPnet2] remove find_unused_parameters from DataParallel #3149 by @kamo-naoyuki
  • [Bugfix][ESPnet2][ASR] Changed to include nlsyms.txt in the pretrained model #3236 by @kamo-naoyuki
  • [Bugfix][ESPnet2][ASR] Fix missing nlsyms.txt for pretrained models #3234 by @lumaku
  • [Bugfix][ESPnet2][ASR] Workaround for missing nlsyms.txt #3235 by @kamo-naoyuki
  • [Bugfix][ESPnet1][ASR][Installation] GTN CTC bug fix, unit test, and installer #3199 by @brianyan918
  • [Bugfix][ESPnet2][README] Update README.md, edit wrong file link. #3164 by @xxjjvxb

Enhancement

  • [Enhancement] Added "trans_type" to utils/remove_longshortdata.sh and utils/update_json.sh #3148 by @teinhonglo
  • [Enhancement][ESPnet2][SE][README] Update the readme file for the SE demo page. #3225 by @LiChenda
  • [Enhancement][ESPnet2][ASR][README] update asr demo #3192 by @ftshijt

Recipe

  • [Recipe][ESPnet1][ASR] Fix segmentation in IWSLT21 ASR #3169 by @hirofumi0810
  • [Recipe][ESPnet1][ASR] Fix tokenization on TEDLIUM2 in IWSLT21 ASR recipe #3142 by @hirofumi0810
  • [Recipe][ESPnet1][ASR] fix add_to_datadir.py in mgb2 recipe #3238 by @AshrafMahdhi
  • [Recipe][ESPnet1][ASR] fix receipe bug for swbd #3174 by @yuekaizhang
  • [Recipe][ESPnet1][ASR][RNNT] Transducer configs & results for AISHELL-1 #3240 by @yusshino
  • [Recipe][ESPnet1][ASR][ST] Fix IWSLT21 recipe for test set evaluation #3155 by @hirofumi0810
  • [Recipe][ESPnet1][ESPnet2][README] endangered language recognition espnet2 recipe #3214 by @ftshijt
  • [Recipe][ESPnet1][MT] Add IWSLT21 MT recipe #3140 by @hirofumi0810
  • [Recipe][ESPnet1][ST] Add IWSLT21 ST recipe #3150 by @hirofumi0810
  • [Recipe][ESPnet1][ST] Fix IWSLT evaluation data preparation #3168 by @hirofumi0810
  • [Recipe][ESPnet1][ST] IWSLT21 punctuation restoration recipe #3145 by @hirofumi0810
  • [Recipe][ESPnet1][ST] Merge short segments in IWSLT test sets #3162 by @hirofumi0810
  • [Recipe][ESPnet1][TTS] Fix misspelling in ./egs/jsut/tts1/local/download.sh #3227 by @muramasa2
  • [Recipe][ESPnet2][ASR] Normalization for Open_li52 #3215 by @ftshijt
  • [Recipe][ESPnet2][SE] ESPnet-SE Recipe for noisy reverberant dataset #3243 by @LiChenda
  • [Recipe][ESPnet2][SE][README] Update recipes for speech enhancement task #3153 by @LiChenda

Acknowledgements

Special thanks to @AshrafMahdhi, @LiChenda, @brianyan918, @deciding, @ftshijt, @hirofumi0810, @kamo-naoyuki, @lumaku, @lzm0706, @muramasa2, @pengchengguo, @sw005320, @teinhonglo, @xxjjvxb, @yuekaizhang, @yusshino.

v.0.9.9

3 years ago

New Features

  • [New Features][ESPnet2] Speaker diarization implementation in ESPnet #2939 by @ftshijt
  • [New Features][ESPnet2] Adding gpu_max_cached_mem_GB in reporter's stats #3057 by @kamo-naoyuki
  • [New Features][ESPnet2] add --detect_anomaly option #3035 by @kamo-naoyuki
  • [New Features][ESPnet2][SE] Further update to speech enhancement task #2929 by @shincling

Bugfix

  • [Bugfix][ESPnet1] Fix a typo in the aishell config #3089 by @sw005320
  • [Bugfix][ESPnet1] Fix utils/speed_perturb.sh #3062 by @hirofumi0810
  • [Bugfix][ESPnet1] fix #3017 #3022 by @kamo-naoyuki
  • [Bugfix][ESPnet1][RNNT] Fix+update RNN encoder #3048 by @b-flo
  • [Bugfix][ESPnet1][RNNT] Minor fix for NSC #3030 by @b-flo
  • [Bugfix][ESPnet2] Fix #3072 #3073 by @kamo-naoyuki
  • [Bugfix][ESPnet2] Fix ESPnet2-TTS conformer backward compatibility #3108 by @kan-bayashi
  • [Bugfix][ESPnet2] Fix a bug when use_amp=True without fairscale #3029 by @kamo-naoyuki
  • [Bugfix][ESPnet2] Fix logging for pytorch>=1.8 #3056 by @kamo-naoyuki
  • [Bugfix][ESPnet2] Fixed backward compatibility issue of new conformer definition #3068 by @hfujihara
  • [Bugfix][Installation] Fix a bug of uninstalling typing #3058 by @kamo-naoyuki
  • [Bugfix][Installation] Fix setup.py to install filelock #3074 by @kamo-naoyuki
  • [Bugfix][Installation] fix the condition to install fairscale #3050 by @kamo-naoyuki
  • [Bugfix][Recipe][ESPnet1] Typo fixed for nahuatl recipe #3044 by @ftshijt
  • [Bugfix][Recipe][ESPnet1][ASR] Bugfix for download_and_untar for nahuatl #3049 by @ftshijt
  • [Bugfix][Recipe][ESPnet1][ESPnet2][TTS] Fix CSMSC download script #3109 by @kan-bayashi
  • [Bugfix][Recipe][ESPnet2][TTS][README] fixed typo #3121 #3123 by @kan-bayashi

Enhancement

  • [Enhancement][ASR][ESPnet1][RNNT] Update loss report #3110 by @b-flo
  • [Enhancement][ESPnet1][RNNT] Fix related to custom encoder and aux task #3045 by @b-flo
  • [Enhancement][ESPnet2][Documentation][Installation][README] modification of freezing option for Wav2Vec encoder, add documents #3036 by @simpleoier

Recipe

  • [Recipe][ESPnet1][ASR] added results and uploaded models #3063 by @sw005320
  • [Recipe][ESPnet1][ASR][ST] fix download for puebla-nahuatl #3039 by @ftshijt
  • [Recipe][ESPnet1][MT] Update IWSLT18 MT recipe #3071 by @hirofumi0810
  • [Recipe][ESPnet1][ST] IWSLT21-low-resource recipe #3023 by @ftshijt
  • [Recipe][ESPnet1][ST] Nahuatl Speech Translation #3034 by @ftshijt
  • [Recipe][ESPnet2][ASR][README] Added spgispeech recipe in espnet2 #2986 by @sw005320
  • [Recipe][ESPnet2][ASR][README] Update librispeech result #3082 by @kamo-naoyuki
  • [Recipe][ESPnet2][ASR][README] Updated ami ihm result #3091 by @kamo-naoyuki
  • [Recipe][ESPnet2][ASR][README] added a bpe10000 model and result #3060 by @sw005320
  • [Recipe][ESPnet2][ASR][README] gigaspeech #3077 by @sw005320

Refactoring

  • [Refactoring][ESPnet1] Refactor layer selection in Transformer #3024 by @hirofumi0810
  • [Refactoring][ESPnet1][MT][ST] Unify divide_lang.sh #3066 by @hirofumi0810
  • [Refactoring][ESPnet2] Make batch bins sampler faster #3106 by @kamo-naoyuki
  • [Refactoring][Installation] Use new pyopenjtalk version #3107 by @kan-bayashi
  • [Refactoring][ESPnet1][ESPnet2][Installation][Docker][Documentation] Change '#!/bin/bash' to '#!/usr/bin/env bash' #3059 by @kamo-naoyuki

Other

  • [CI][Installation][README][mergify] Using torch=1.8.1 in ci tests #3122 by @kamo-naoyuki
  • [CI][Installation][README][mergify] Adding pytorch=1.8.0 to the ci #3046 by @kamo-naoyuki

Acknowledgements

Special thanks to @b-flo, @ftshijt, @hfujihara, @hirofumi0810, @kamo-naoyuki, @kan-bayashi, @shincling, @simpleoier, @sw005320.

v.0.9.8

3 years ago

New Features

  • [New Features][ESPnet1][ASR][RNNT] Auxiliary task #2951 by @b-flo
  • [New Features][ESPnet1][Recipe] RTF calculation #2942 by @hirofumi0810
  • [New Features][ESPnet2] Supporting multiple optimizers in the default trainer #3014 by @kamo-naoyuki
  • [New Features][ESPnet2][ASR] Streaming Transformer ASR #2907 by @eml914
  • [New Features][ESPnet2][ASR][Installation] add wav2vec_encoder #2889 by @simpleoier
  • [New Features][ESPnet2][Documentation][Installation][README] Support sharded training of fairscale #2980 by @kamo-naoyuki
  • [New Features][ESPnet2][SE] Add SeparateSpeech API in espnet2/bin/enh_inference.py #2878 by @Emrys365
  • [New Features][ESPnet2][TTS][Installation][README] Support phonemizer for various language G2P #2959 by @kan-bayashi

Bugfix

  • [Bugfix][CI][Installation] Install warp-ctc using pip>=21.0 #2999 by @ysk24ok
  • [Bugfix][ESPnet1] Integration testing for asr_mix was using the wrong config. #3006 by @siddalmia
  • [Bugfix][ESPnet1][ASR] Fix model averaging #2910 by @b-flo
  • [Bugfix][ESPnet1][ASR] bug fixed for streaming transformer ASR #2981 by @eml914
  • [Bugfix][ESPnet1][ASR] builtin ctc modification #3001 by @siddalmia
  • [Bugfix][ESPnet1][ASR][CI] Fix transfer learning w/ pre-trained LM + finetuning tutorial #2967 by @b-flo
  • [Bugfix][ESPnet1][ASR][RNNT] Fix a condition in TSD #2965 by @b-flo
  • [Bugfix][ESPnet1][ASR][Recipe] fix egs/ljspeech/asr1 #2865 #2884 by @kan-bayashi
  • [Bugfix][ESPnet1][ASR][Recipe][ST] Fix bug in How2 recipe #2933 by @hirofumi0810
  • [Bugfix][ESPnet1][ASR][Refactoring] Fix data sorting in attention/CTC visualization #2883 by @hirofumi0810
  • [Bugfix][ESPnet1][Docker] Fix docker error caused by BeamSearchTransducer #2973 by @b-flo
  • [Bugfix][ESPnet1][ESPnet2] Fix bugs of our Conformer implementation. #2816 by @pengchengguo
  • [Bugfix][ESPnet1][ESPnet2][Refactoring] Fix arguments in dynamic and lightweight conv #3004 by @hirofumi0810
  • [Bugfix][ESPnet1][RNNT] fix out_dim definition #2915 by @b-flo
  • [Bugfix][ESPnet1][TTS] Fix attention plot bug #2984 #2985 by @kan-bayashi
  • [Bugfix][ESPnet1][mergify] swbd run.sh is including dev data in the training set #2977 by @brianyan918
  • [Bugfix][ESPnet2] Fix sharded_ddp mode #3015 by @kamo-naoyuki
  • [Bugfix][ESPnet2] bug fix for Wav2Vec encoder #2997 by @simpleoier
  • [Bugfix][ESPnet2][Documentation] Fix for sharded training with amp #2993 by @kamo-naoyuki
  • [Bugfix][ESPnet2][Documentation] Fix sharded training for multiple nodes #2994 by @kamo-naoyuki
  • [Bugfix][ESPnet2][SE] quick fix for librimix (SE) data preparation #2982 by @LiChenda

Recipe

  • [Recipe][ESPnet1][ASR] Fix dev set in IWSLT21 ASR recipe #3000 by @hirofumi0810
  • [Recipe][ESPnet1][ASR] IWSLT'21 ASR recipe #2934 by @hirofumi0810
  • [Recipe][ESPnet1][ASR] Update IWSLT21 ASR recipe #2987 by @hirofumi0810
  • [Recipe][ESPnet1][ASR] Update the pre-trained Conformer model link of Aishell-1 corpus. #2924 by @pengchengguo
  • [Recipe][ESPnet1][ASR] Update transformer training results on Common Voice dataset #2927 by @wenjie-p
  • [Recipe][ESPnet1][ASR][CI][Installation][Refactoring] Update IWSLT18 (ST-TED) ASR recipe #2916 by @hirofumi0810
  • [Recipe][ESPnet1][ASR][MT][ST][README] Must-C v2 recipe #2963 by @hirofumi0810
  • [Recipe][ESPnet1][ASR][MT][ST][Refactoring] Refactor Fisher-CallHome recipe #2904 by @hirofumi0810
  • [Recipe][ESPnet1][ASR][MT][ST][Refactoring] Refactor How2 recipe #2906 by @hirofumi0810
  • [Recipe][ESPnet1][ASR][MT][ST][Refactoring] Refactor Must-C recipe #2901 by @hirofumi0810
  • [Recipe][ESPnet1][ASR][MT][ST][Refactoring] Refactor libri-trans recipe #2903 by @hirofumi0810
  • [Recipe][ESPnet1][ASR][ST][Refactoring] Update IWSLT'19 recipe #2940 by @hirofumi0810
  • [Recipe][ESPnet1][ST][CI][Refactoring] Refactor ST recipes #2975 by @hirofumi0810
  • [Recipe][ESPnet1][ST][Refactoring] Refactor Mboshi-French corpus #2911 by @hirofumi0810
  • [Recipe][ESPnet2][ASR] Open-li52 (add language id scoring & text case align for test set) #2938 by @ftshijt
  • [Recipe][ESPnet2][ASR][README] Add Russian open STT recipe for ESPnet2 #2972 by @akreal
  • [Recipe][ESPnet2][ASR][README] MLS (multi-lingual librispeech) recipe #2869 by @ftshijt
  • [Recipe][ESPnet2][ASR][README] Update espnet2 librispeech result #2966 by @kamo-naoyuki
  • [Recipe][ESPnet2][ASR][README] added nsc results #2937 by @sw005320
  • [Recipe][ESPnet2][ASR][README] fix librispeech model url #2976 by @kamo-naoyuki
  • [Recipe][ESPnet2][ASR][README] minor fix of li52 and nsc recipes #2936 by @sw005320
  • [Recipe][ESPnet2][ASR][README] update the results of open li52 recipe #2974 by @sw005320
  • [Recipe][ESPnet2][SE] Librimix separation results for Conv-Tasnet, 8k, min #2928 by @anogkongda
  • [Recipe][ESPnet2][SE][README] Espnet-SE, Speech enhancement recipes #2888 by @LiChenda

Enhancement

  • [Enhancement][ESPnet1][ASR] Auto Resampling to 16 kHz for pretrained models #2969 by @siddalmia
  • [Enhancement][ESPnet1][ASR][RNNT] Minor refactoring #2932 by @b-flo
  • [Enhancement][ESPnet1][ASR][RNNT][README][CI][Documentation] Refactoring RNNT #2887 by @b-flo
  • [Enhancement][ESPnet1][ESPnet2][ASR][LM][MT][TTS] Print total params and trainable params. #2996 by @siddalmia
  • [Enhancement][ESPnet1][LM] Add LM options like embedding dropout and tie weights #3010 by @siddalmia
  • [Enhancement][ESPnet1][ST][Refactoring] Add the latest RPE implementation to the ST task. #3005 by @pengchengguo

Other

  • [CI][README][mergify] Stop circle ci #2978 by @kamo-naoyuki
  • [Documentation] Update docs for ESPnet contributing (especially for recipes part) #2905 by @ftshijt
  • [Documentation] fix a typo #3016 by @Huang17
  • [Installation] Uninstall typing #2979 by @kamo-naoyuki

Acknowledgements

Special thanks to @Emrys365, @Huang17, @LiChenda, @akreal, @anogkongda, @b-flo, @brianyan918, @eml914, @ftshijt, @hirofumi0810, @kamo-naoyuki, @kan-bayashi, @pengchengguo, @siddalmia, @simpleoier, @sw005320, @wenjie-p, @ysk24ok.