SpleeterRT

Real-time monaural source separation based on a fully convolutional neural network that operates in the time-frequency domain.

v0.2-alpha

3 years ago

coefficients.7z contains 4 neural network model files, which the 2stems and 4stems VST DLLs read. For the 4stems VST: the quality of _256 is much higher than that of _128 and the separation is much more stable; however, the latency of _128 is half that of _256. For the 2stems VST: the quality of _128 and _256 is roughly similar; however, _64 offers the lowest latency and its quality is not bad at all. This version is highly optimized and closed source.

Windows Installation:

Extract all files inside coefficients.7z to C:\Users\YOUR PROFILE (this step is only required for 4stems).
Extract Spleeter_Win.7z to your VST host's search path.
Load the VST .dll in any VST host.

Tested hosts: Adobe Audition, Audacity (x86), foobar2000 VST adapter.

Caution: the x86 4stems VSTs are likely to run out of memory (OOM) and crash. That is why I will not support a VST3 4stems build; VST3 usually uses more memory.

Mac Installation:

Install Intel MKL (important: I suspect the .vst was dynamically linked).
Extract all files inside coefficients.7z to /Users/YOUR HOME DIRECTORY
Extract Spleeter_Mac_x64.zip to /Users/YOUR HOME DIRECTORY/Library/Audio/Plug-Ins/VST/
Load the .vst in any VST host.

Tested hosts: Adobe Audition (x64), Audacity (x64).

Intrinsic latency of the algorithm with the STFT scheme:

256 -> ((F / Lap) * T * BufFactor) / Fs -> ((4096 / 4) * 256 * 2) / 44100 -> 11.8886 secs
128 -> ((F / Lap) * T * BufFactor) / Fs -> ((4096 / 4) * 128 * 2) / 44100 -> 5.9443 secs
64 -> ((F / Lap) * T * BufFactor) / Fs -> ((4096 / 4) * 64 * 2) / 44100 -> 2.9721 secs
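The latency formula can be checked with a few lines of Python. This is only a sketch of the arithmetic above, not code from the plugin: the function name and parameter names are made up for illustration, and they map to F (FFT size), Lap (overlap factor), T (time frames per network input), BufFactor and Fs (sample rate) in the formula.

```python
def intrinsic_latency_seconds(time_frames, fft_size=4096, overlap=4,
                              buf_factor=2, sample_rate=44100):
    """Evaluate ((F / Lap) * T * BufFactor) / Fs from the release notes.

    All names are illustrative; only the formula and the default
    numbers (4096, 4, 2, 44100) come from the text above.
    """
    hop_size = fft_size / overlap          # F / Lap: samples advanced per STFT frame
    buffered_samples = hop_size * time_frames * buf_factor
    return buffered_samples / sample_rate  # convert samples to seconds


# Print the latency for the three model variants (_256, _128, _64).
for frames in (256, 128, 64):
    print(f"_{frames}: {intrinsic_latency_seconds(frames):.4f} secs")
```

Because the result is linear in T, halving the number of time frames halves the latency, which is why _128 has half the latency of _256.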

Opinion:

A latency of 2.9721 secs would be the lowest I've ever seen for a deep-learning-based monaural source separation algorithm! Because Spleeter is an STFT image-segmentation network, we don't have to deal with the time-frame boundary effect seen in Demucs, nor with the very high chance of nonlinear distortion in the output that comes with Wave-U-Net and Demucs. Spleeter may have a boundary effect, but not in the time-domain way of "adding" an impulse to your signal.

The instructions above were written for the online (VST) version. For the offline version (SpleeterRT_windows_offline.7z), you just need to extract the .exe file somewhere and use the CLI to play around.

v0.1-alpha

3 years ago

The .7z package contains 2 DLL files, Spleeter4Stems_128.dll and Spleeter4Stems_256.dll, which are the same in functionality. The quality of _256 is much higher than that of _128 and the separation is much more stable; however, the latency of _128 is half that of _256.

Installation:

Extract the .7z package.
Copy accompaniment4stems.dat, bass4stems.dat, drum4stems.dat, vocal4stems.dat to C:\Users\YOUR PROFILE
Load the VST .dll in any VST host.
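If you prefer to script the copy step, a minimal Python sketch could look like the one below. It is not part of the release: the function name `install_coefficients` and its arguments are hypothetical, and it assumes that `Path.home()` resolves to the same C:\Users\YOUR PROFILE directory the plugin reads from.

```python
import shutil
from pathlib import Path

# File names taken from the installation steps above.
COEFFICIENT_FILES = (
    "accompaniment4stems.dat",
    "bass4stems.dat",
    "drum4stems.dat",
    "vocal4stems.dat",
)


def install_coefficients(extracted_dir, profile_dir=None):
    """Copy the four .dat files from extracted_dir into the profile directory.

    profile_dir defaults to the current user's home directory (an assumption
    about where the plugin looks; pass a path explicitly if yours differs).
    """
    src = Path(extracted_dir)
    dst = Path(profile_dir) if profile_dir is not None else Path.home()

    # Fail early, before copying anything, if the archive was not fully extracted.
    missing = [name for name in COEFFICIENT_FILES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError("missing coefficient files: " + ", ".join(missing))

    copied = []
    for name in COEFFICIENT_FILES:
        copied.append(Path(shutil.copy2(src / name, dst / name)))
    return copied
```

Call it with the folder you extracted the .7z into, for example `install_coefficients("path/to/extracted")`, where the path is a placeholder for your own location.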

Tested hosts: Adobe Audition, Audacity, foobar2000 VST adapter.