Efficient Training of Audio Transformers with Patchout
Added a pre-trained model trained with knowledge distillation (KD), using the setup in https://github.com/fschmid56/EfficientAT
Pre-trained PaSST-U and PaSST-B on AudioSet
fsd50k-passt-s-n-f128-p16-s16-ap.642.pt: pre-trained on FSD50K with structured patchout and no overlap, mAP = 0.642.
fsd50k-passt-s-f128-p16-s10-ap.655.pt: pre-trained on FSD50K with structured patchout, mAP = 0.655.
openmic-passt-s-f128-10sec-p16-s10-ap.85.pt: pre-trained on OpenMIC-2018 with structured patchout, mAP = 0.85.
passt-s-f128-30sec-p16-s10-ap.473-swa.pt: pre-trained on AudioSet, supports inference on inputs of up to 30 seconds.
passt-s-f128-20sec-p16-s10-ap.474-swa.pt: pre-trained on AudioSet, supports inference on inputs of up to 20 seconds.
Pre-trained models with a smaller STFT hop
Added more pre-trained models
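
The checkpoint filenames listed in these notes appear to follow a common pattern, so the settings they encode can be recovered programmatically. The sketch below is a minimal illustration and is not part of the PaSST code base: the function name parse_checkpoint_name is hypothetical, and the reading of the tokens (f = number of mel bins, p = patch size, s = patch stride, NNsec = maximum input length, ap = reported score, swa = weight-averaged checkpoint) is an assumption inferred from the filenames and their descriptions.

```python
import re
from typing import Any, Dict, Optional


def _int_token(stem: str, prefix: str, suffix: str = "") -> Optional[int]:
    """Return the integer in a dash-delimited token such as 'p16' or '30sec'."""
    match = re.search(rf"(?:^|-){prefix}(\d+){suffix}(?=-|$)", stem)
    return int(match.group(1)) if match else None


def parse_checkpoint_name(filename: str) -> Dict[str, Any]:
    """Split a checkpoint filename into its (assumed) settings.

    Hypothetical helper: the meaning given to each token is an assumption
    inferred from the release notes, not a documented naming scheme.
    """
    stem = filename[:-3] if filename.endswith(".pt") else filename

    # Anything before 'passt' is treated as a dataset prefix (may be absent).
    dataset = stem.split("passt")[0].rstrip("-") or None

    # 'ap.642' is read as a score of 0.642, 'ap.85' as 0.85.
    score_match = re.search(r"-ap\.(\d+)", stem)
    score = float("0." + score_match.group(1)) if score_match else None

    return {
        "dataset": dataset,
        "mel_bins": _int_token(stem, "f"),            # assumed: f128 -> 128 mel bins
        "patch_size": _int_token(stem, "p"),          # assumed: p16 -> 16x16 patches
        "stride": _int_token(stem, "s"),              # assumed: s10 -> stride of 10
        "max_seconds": _int_token(stem, "", "sec"),   # assumed: 30sec -> 30 s inputs
        "score": score,
        "swa": stem.endswith("-swa"),                 # assumed: weight-averaged model
    }


if __name__ == "__main__":
    for name in (
        "fsd50k-passt-s-n-f128-p16-s16-ap.642.pt",
        "passt-s-f128-30sec-p16-s10-ap.473-swa.pt",
    ):
        print(name, parse_checkpoint_name(name))
```

Under this reading, the "no overlap" FSD50K checkpoint is the one whose stride equals its patch size (p16, s16), while the other checkpoints use a stride of 10 with 16-pixel patches.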