Hyperion Versions

Python toolkit for speech processing

2 years ago

Breaking Changes:
- The s parameter of the x-vector AAM/AM softmax loss has been renamed to cos_scale to avoid conflicts with other arguments that start with s. Models trained with older versions should still work.
- The multi-GPU launcher has been changed to the newer torchrun, so multi-GPU runs won't work on PyTorch <1.9.
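
As a rough illustration of what the renamed cos_scale argument controls, the sketch below computes additive angular margin (AAM) softmax logits in plain Python: a margin is added to the angle of the target class, and every cosine is then multiplied by cos_scale. The function name, the margin argument and the example values are assumptions made for illustration; they are not Hyperion's API or defaults.

```python
import math

def aam_softmax_logits(cosines, target, cos_scale=30.0, margin=0.3):
    """Scaled logits for one sample under an additive angular margin.

    cosines: cosine similarity between the embedding and each class weight.
    target: index of the true class.
    cos_scale, margin: illustrative values, not Hyperion defaults.
    """
    logits = []
    for k, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if k == target:
            # Add the margin to the true-class angle; this lowers its
            # cosine whenever theta + margin <= pi.
            c = math.cos(math.acos(c) + margin)
        logits.append(cos_scale * c)
    return logits

# Three classes, true class is index 0.
logits = aam_softmax_logits([0.8, 0.1, -0.2], target=0)
print(logits)
```

The logits would then go through a regular softmax cross-entropy; a larger cos_scale makes the resulting distribution sharper.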
Hyperion has been updated to work with PyTorch >=1.9; PyTorch 1.10 is recommended.
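
torchrun starts several worker processes (typically one per GPU) and tells each one its position through environment variables such as RANK, LOCAL_RANK and WORLD_SIZE. The minimal sketch below shows how a training script might read them; the helper name and the single-process fallback values are assumptions for illustration, not Hyperion code.

```python
import os

def read_torchrun_env(environ=None):
    """Return (rank, local_rank, world_size) from torchrun-style variables.

    Falls back to a single-process setup when the variables are missing,
    e.g. when the script is started with python instead of torchrun.
    """
    env = os.environ if environ is None else environ
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size

# Simulate what the second of four workers on one node would see.
print(read_torchrun_env({"RANK": "1", "LOCAL_RANK": "1", "WORLD_SIZE": "4"}))
```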
Documentation automatically uploaded to https://hyperion-ml.readthedocs.io

2 years ago

Changes in requirements: the current version only works with
- PyTorch >=1.6, <=1.8.0
- FairScale = 0.3.8
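
A quick way to see whether an installed PyTorch falls inside that range is to compare version tuples. The sketch below is an assumption-laden helper (the function names are hypothetical) that only handles plain release strings such as 1.7.1 or 1.7.1+cu110, not pre-release tags.

```python
def version_tuple(version):
    """Turn '1.8.0' or '1.7.1+cu110' into a 3-tuple of ints, e.g. (1, 8, 0)."""
    public = version.split("+")[0]  # drop a local suffix such as '+cu110'
    parts = [int(part) for part in public.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad '1.6' to (1, 6, 0)
    return tuple(parts)

def pytorch_supported(version, minimum="1.6", maximum="1.8.0"):
    """Check minimum <= version <= maximum using the bounds listed above."""
    return version_tuple(minimum) <= version_tuple(version) <= version_tuple(maximum)

for candidate in ["1.5.0", "1.6", "1.7.1+cu110", "1.8.0", "1.8.1"]:
    print(candidate, pytorch_supported(candidate))
```

In a real environment the string passed in would come from torch.__version__.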

2 years ago

Adds ECAPA-TDNN
New Recipes:
- SRE21-AV Audio: 16k and 8k versions
- SRE21-AV Visual: face recognition recipes using pre-trained MXNet and PyTorch ArcFace embeddings
- SRE21-AV: fusion of audio and visual modalities at the score level

2 years ago

Make Hyperion pip installable
Adds installation instructions
Configuration files and command-line arguments are now handled with jsonargparse instead of argparse. This allows us to use YAML files and to override the values in the YAML file from the command line.
First version using nn.parallel.DistributedDataParallel instead of nn.DataParallel
Supports FairScale Sharded DataParallel; we have not observed significant memory gains in our models with it so far
Added SpineNet, Spine2Net and TSE-Spine2Net x-vector architectures from our IS21 paper
Added SpecAugment PyTorch layer
Added NumPy speed augmentation class
Fixed make_voxceleb2cat.pl: all speakers extracted from the same video were getting the same spkid, because the script did not take into account that more than one speaker could be extracted from each video
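
To make the configuration change in this release concrete, the sketch below mimics the precedence it describes: values loaded from a YAML file act as the baseline, and dotted keys given on the command line replace individual entries. It is a plain-Python illustration of that behavior, not jsonargparse's implementation; the function name, keys and values are hypothetical.

```python
import copy

def apply_overrides(config, overrides):
    """Return a copy of a nested config with dotted-key overrides applied."""
    merged = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        node = merged
        *parents, leaf = dotted_key.split(".")
        for name in parents:
            node = node.setdefault(name, {})  # create missing sections
        node[leaf] = value
    return merged

# Stand-in for the contents of a YAML file (hypothetical keys and values).
file_config = {"optim": {"lr": 0.05, "momentum": 0.9}, "epochs": 50}
# Stand-in for command-line arguments such as: --optim.lr 0.01
cli_overrides = {"optim.lr": 0.01}

config = apply_overrides(file_config, cli_overrides)
print(config)
```

Only the overridden entry changes; everything else keeps the value from the file, and the original dictionary is left untouched.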
New recipes:
- Recipe for classifying adversarial attack algorithms and threat models from our IS21 paper (voxceleb/adv.v2)
- Recipe for adversarial attacks against speaker verification renamed as voxceleb/adv.v1 and adv.v1.1; these recipes have been updated and cleaned up
- Recipe for SRE19-AV Audio part with AHC diarization (sre19-av-a/v2.1)
- Recipe for the Chime5 speaker verification setup (chime5/v1)
- Recipes for SRE19-AV face recognition using the pretrained RetinaFace face detector and ArcFace embeddings from the InsightFace MXNet repository (sre19-av-v/v0.1) and from Insightface-Pytorch (sre19-av-v/v0.2)
- Added VOiCES challenge recipe
- Added SRE20-CTS recipe v1
- Added Spine2Net results in voxceleb/v1.1 recipe
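
One of the additions in this release is a SpecAugment layer. For readers unfamiliar with the technique, the sketch below shows its core idea on a toy spectrogram: a random band of frequency bins and a random band of time frames are zeroed out. It uses plain Python lists rather than PyTorch tensors, and the function name, arguments and mask widths are assumptions for illustration, not Hyperion's layer.

```python
import random

def spec_augment(spec, max_freq_width=2, max_time_width=3, rng=None):
    """Zero out one random band of frequency bins and one of time frames.

    spec: list of frames, each a list of frequency-bin values.
    Returns a masked copy; the input is left unchanged.
    """
    if rng is None:
        rng = random.Random()
    num_frames, num_bins = len(spec), len(spec[0])
    out = [list(frame) for frame in spec]

    # Pick the width and start position of each mask.
    f_width = rng.randint(0, min(max_freq_width, num_bins))
    f_start = rng.randint(0, num_bins - f_width)
    t_width = rng.randint(0, min(max_time_width, num_frames))
    t_start = rng.randint(0, num_frames - t_width)

    for t in range(num_frames):
        for f in range(num_bins):
            in_freq_mask = f_start <= f < f_start + f_width
            in_time_mask = t_start <= t < t_start + t_width
            if in_freq_mask or in_time_mask:
                out[t][f] = 0.0
    return out

spec = [[1.0] * 8 for _ in range(10)]  # 10 frames x 8 bins of dummy data
masked = spec_augment(spec, rng=random.Random(0))
```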
