:robot: The free, Open Source OpenAI alternative. Self-hosted, community...
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generatio...
Text-to-Audio/Music Generation
Audio generation using diffusion models, in PyTorch.
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, ...
A family of diffusion models for text-to-audio generation.
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint ...
Source code for "Taming Visually Guided Sound Generation" (Oral at the B...
PyTorch implementation of BigVSAN
Official PyTorch implementation of the paper: "Catch-A-Waveform: Learnin...
Reading list for research topics in Sound AI
Trainer for audio-diffusion-pytorch
Word2Wave: a framework for generating short audio samples from a text pr...
FunCodec is a research-oriented toolkit for audio quantization and downs...
Implementation of SoundStorm, Efficient Parallel Audio Generation from G...