Python Sound Tool Versions Save

SoundPy (alpha stage) is a research-based Python package for speech and sound. Applications include deep learning, filtering, speech enhancement, audio augmentation, feature extraction and visualization, dataset and audio file conversion, and beyond.

v0.1.0a2

3 years ago

Available via PyPI

pip install soundpy==0.1.0a2

Updates of v0.1.0a2 release:

Updated Dependencies

Updated dependencies to the newest versions still compatible with TensorFlow 2.1.0.
Note: a bug in training with generators occurs with TensorFlow 2.2.0+: models trained via generators fail to learn. TensorFlow is therefore limited to version 2.1.0 until that bug is fixed.

GPU option added

provides instructions for running the Docker image with GPU support

soundpy.dsp.vad

added the use_beg_ms parameter: improves VAD recognition of silences after speech.
raises a warning for sample rates lower than 44100 Hz, since VAD seems to fail at lower sample rates.
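
The sample-rate warning described above could look roughly like the sketch below. This is not soundpy's code: the function name check_vad_sample_rate and the constant MIN_RELIABLE_SR are hypothetical, and only the 44100 Hz threshold comes from these notes.

```python
import warnings

# Threshold taken from the release notes; the name is a hypothetical choice.
MIN_RELIABLE_SR = 44100


def check_vad_sample_rate(sr):
    """Warn if `sr` is below the rate at which VAD is reported to work.

    Returns True when the sample rate is high enough, False otherwise.
    """
    if sr < MIN_RELIABLE_SR:
        warnings.warn(
            f"Sample rate {sr} Hz is below {MIN_RELIABLE_SR} Hz; "
            "VAD may fail at lower sample rates.",
            UserWarning,
        )
        return False
    return True


ok_high = check_vad_sample_rate(48000)  # no warning expected
```

A warning rather than an exception lets the caller continue with low-rate audio while still being told the result may be unreliable.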

soundpy.feats.get_vad_samples and soundpy.feats.get_vad_stft

moved from dsp module to the feats module
added the extend_window_ms parameter: can extend the VAD window if desired. Useful in higher SNR environments.
raises a warning for sample rates lower than 44100 Hz, since VAD seems to fail at lower sample rates.

added soundpy.feats.get_samples_clipped and soundpy.feats.get_stft_clipped

another option for VAD
clips beginning and ending of audio data where high energy sound starts and ends.
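
As an illustration of the clipping idea above, here is a minimal NumPy sketch that trims a signal to the span where its amplitude is high. It is not the implementation behind soundpy.feats.get_samples_clipped: the function name clip_to_high_energy and the rel_threshold parameter are assumptions made for this example.

```python
import numpy as np


def clip_to_high_energy(samples, rel_threshold=0.1):
    """Keep only the span between the first and last 'loud' sample.

    A sample counts as loud when its absolute amplitude (averaged over
    channels for multi-channel input) reaches `rel_threshold` times the peak.
    """
    samples = np.asarray(samples, dtype=float)
    if samples.size == 0:
        return samples
    envelope = np.abs(samples)
    if envelope.ndim > 1:
        envelope = envelope.mean(axis=1)  # consider all channels together
    peak = envelope.max()
    if peak == 0:
        return samples  # pure silence: nothing to clip to
    loud = np.flatnonzero(envelope >= rel_threshold * peak)
    return samples[loud[0]:loud[-1] + 1]


# 100 silent samples, a 50-sample burst, then 100 silent samples
quiet = np.zeros(100)
burst = 0.8 * np.ones(50)
signal = np.concatenate([quiet, burst, quiet])
clipped = clip_to_high_energy(signal)
```

Unlike a full VAD, this only removes leading and trailing low-energy audio; quiet stretches inside the retained span are left untouched.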

soundpy.models.dataprep.GeneratorFeatExtraction

can extract and augment features from audio files as each audio file is fed to the model.
an example can be viewed in soundpy.models.builtin.envclassifier_extract_train
note: still very experimental

soundpy.dsp.add_backgroundsound

improvements in the smoothness of the added signal.
soundpy.dsp.clip_at_zero
improved soundpy.dsp.vad and soundpy.feats.get_vad_stft

soundpy.feats.normalize

also available as soundpy.normalize, so there is no need to remember whether it lives in dsp or feats

soundpy.dsp.remove_dc_bias

implemented in soundpy.files.loadsound() and soundpy.files.savesound()
vastly improves the ability to work with and combine signals.
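
To show why removing DC bias helps when combining signals, here is a minimal NumPy sketch that centres a signal on zero by subtracting its mean. It illustrates the general technique only; the function name remove_dc_offset is hypothetical and the code is not taken from soundpy.dsp.remove_dc_bias.

```python
import numpy as np


def remove_dc_offset(samples):
    """Subtract the per-channel mean so each channel is centred on zero.

    Axis 0 is assumed to be time, so this works for shape (n,) and
    shape (n, channels).
    """
    samples = np.asarray(samples, dtype=float)
    return samples - samples.mean(axis=0)


# A 440 Hz tone sampled at 8 kHz with a constant offset of 0.2
t = np.linspace(0, 1, 8000, endpoint=False)
biased = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.2
centred = remove_dc_offset(biased)
```

Two signals with different offsets would otherwise produce a step at the point where they are joined or mixed; centring both on zero first avoids that.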

soundpy.dsp.clip_at_zero

clips beginning and ending audio at zero crossings (at negative to positive zero crossings)
useful when concatenating signals
useful for removing clicks at beginning or ending of audio signals
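
The zero-crossing clipping described above can be sketched in NumPy as follows. This is a mono-only illustration under assumed conventions, not the code of soundpy.dsp.clip_at_zero; the function name trim_to_rising_zero_crossings is hypothetical.

```python
import numpy as np


def trim_to_rising_zero_crossings(samples):
    """Trim a mono signal to start and end at negative-to-positive crossings.

    The result starts on the first non-negative sample that follows a
    negative one, and ends on the last negative sample that precedes a
    non-negative one.
    """
    samples = np.asarray(samples, dtype=float)
    # Index i marks a crossing when samples[i] < 0 <= samples[i + 1]
    rising = np.flatnonzero((samples[:-1] < 0) & (samples[1:] >= 0))
    if rising.size < 2:
        return samples  # too few crossings to trim safely
    start = rising[0] + 1
    end = rising[-1] + 1
    return samples[start:end]


signal = np.array([0.3, -0.2, 0.1, 0.4, -0.3, -0.1, 0.2, 0.5])
trimmed = trim_to_rising_zero_crossings(signal)
# trimmed is [0.1, 0.4, -0.3, -0.1]
```

Because every trimmed signal ends just below zero and begins at or just above it, concatenating two of them continues the waveform in the same direction, which is what avoids an audible click at the join.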

soundpy.dsp.apply_sample_length

can now mirror the sound as a form of sound extension with the parameter mirror_sound.
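
One way to picture mirroring as a form of sound extension is the NumPy sketch below, which alternates forward and reversed copies until a target length is reached. How soundpy.dsp.apply_sample_length actually applies mirror_sound is not stated in these notes, so the function name extend_by_mirroring and its behaviour are assumptions.

```python
import numpy as np


def extend_by_mirroring(samples, target_len):
    """Extend `samples` to `target_len` by appending mirrored copies.

    Copies alternate between forward and reversed, so each join meets
    at the same sample value. Input longer than `target_len` is cut.
    """
    samples = np.asarray(samples)
    if len(samples) == 0:
        raise ValueError("cannot extend an empty signal")
    pieces = []
    total = 0
    forward = True
    while total < target_len:
        pieces.append(samples if forward else samples[::-1])
        total += len(samples)
        forward = not forward
    return np.concatenate(pieces)[:target_len]


extended = extend_by_mirroring(np.array([1, 2, 3]), 8)
# extended is [1, 2, 3, 3, 2, 1, 1, 2]
```

Compared with simply repeating the signal, mirroring avoids the jump from the last sample back to the first at each repetition.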

Removed soundpy_online (and therefore mybinder as well)

for the time being, this is too much work to keep up. The plan is to eventually bring it back in a more maintainable manner.

Added stereo sound functionality to the following functions:

soundpy.dsp.add_backgroundsound
soundpy.dsp.clip_at_zero
soundpy.dsp.calc_fft
soundpy.feats.get_stft
soundpy.feats.get_vad_stft

New functions related to stereo sound

soundpy.dsp.ismono for checking if a signal is mono or stereo
soundpy.dsp.average_channels for averaging amplitude in all channels (e.g. identifying when energetic sounds start / end: want to consider all channels)
soundpy.dsp.add_channels for adding additional channels if needed (e.g. for applying a 'hann' or 'hamming' window to stereo sound)
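
The three helpers above can be illustrated with a small NumPy sketch. The function names and signatures here (is_mono, average_over_channels, tile_to_channels) are hypothetical stand-ins rather than soundpy's own, and the layout (samples, channels) is an assumption.

```python
import numpy as np


def is_mono(samples):
    """True for shape (n,) or (n, 1); False for two or more channels."""
    samples = np.asarray(samples)
    return samples.ndim == 1 or samples.shape[1] == 1


def average_over_channels(samples):
    """Collapse (n, channels) to (n,) by averaging across channels."""
    samples = np.asarray(samples, dtype=float)
    return samples if samples.ndim == 1 else samples.mean(axis=1)


def tile_to_channels(samples, num_channels):
    """Copy a 1-D array into `num_channels` identical columns."""
    samples = np.asarray(samples)
    if samples.ndim != 1:
        raise ValueError("expected a 1-D array")
    return np.tile(samples[:, np.newaxis], (1, num_channels))


# Apply a Hann window to a stereo frame by giving the window two channels
stereo_frame = np.ones((4, 2))
window = tile_to_channels(np.hanning(4), 2)
windowed = stereo_frame * window
```

Averaging channels is handy for decisions that should consider the whole recording at once, such as finding where energetic sound starts and ends, while tiling makes a mono window match the shape of a stereo frame explicitly.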

v0.1.0a1

3 years ago

This release coincides with the pypi release of pysoundtool-0.1.0a1.

Main adjustments include:

setting use_scipy defaults to False (use Librosa wherever possible)
setting dependency versions to avoid errors (numba and librosa; keras)

PySoundTool

3 years ago

An experimental Python framework for sound visualization, analysis, augmentation, filtering as well as machine learning.

Basic functionality for preparing audio datasets (e.g. formatting them), filtering audio, visualizing audio and its features (signal, stft, powspec, fbank, mfcc), augmenting audio for machine learning, and building/implementing basic neural networks for simple speech recognition, speech classification (e.g. language, gender or sex, emotion, etc.), and denoising.

Might be a bit buggy still.

keywords: audio file format conversion, dataset preparation, wiener filter, convolutional neural networks, cnn, conv, lstm, long short-term memory network, cnn+lstm, cnnlstm, convlstm, autoencoder, denoiser, speech recognition, environment classification, scene classification, language classification, denoising, augmentation, feature extraction, mel-filterbank energies, fbank, mel-frequency cepstral coefficients, mfcc, short-time fourier transform, stft, raw signal.
