Speech Recognition Versions

Speech recognition module for Python, supporting several engines and APIs, online and offline.

3.10.3

1 month ago

SpeechRecognition 3.10.3 is out 🎉 Get all of these and more with a quick pip install --upgrade SpeechRecognition. Enjoy!

What's Changed

Improvements

Tweak installation by @ftnext in https://github.com/Uberi/speech_recognition/pull/740
- Support pip install SpeechRecognition[whisper-local]
- Support pip install SpeechRecognition[whisper-api]
Add tests with mock by @ftnext (#738, #739)
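
After installing one of the extras above, it can be handy to confirm that the optional dependency is actually importable. The sketch below is only an illustration: the mapping from extra name to module name (whisper for whisper-local, openai for whisper-api) is an assumption about what each extra pulls in, and available_extras is a hypothetical helper, not part of SpeechRecognition.

```python
import importlib.util

# Assumed mapping from extra name to the top-level module it provides.
# These module names are an assumption, not taken from the release notes.
EXTRA_MODULES = {
    "whisper-local": "whisper",
    "whisper-api": "openai",
}


def available_extras():
    """Report, per extra, whether its assumed module can be imported."""
    return {
        extra: importlib.util.find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }


status = available_extras()
for extra, installed in status.items():
    print(f"SpeechRecognition[{extra}]: {'installed' if installed else 'missing'}")
```

In shells such as zsh, the brackets usually need quoting, for example pip install "SpeechRecognition[whisper-local]".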

Full Changelog: https://github.com/Uberi/speech_recognition/compare/3.10.2...3.10.3

3.10.2

1 month ago

SpeechRecognition 3.10.2 is out 🎉 Get all of these and more with a quick pip install --upgrade SpeechRecognition. Enjoy!

What's Changed

Bugfixes

Updated to the latest OpenAI API changes, and fixed #720 by @herrjemand in https://github.com/Uberi/speech_recognition/pull/729

New Contributors

@herrjemand made their first contribution in https://github.com/Uberi/speech_recognition/pull/729

Thanks to all contributors!

Full Changelog: https://github.com/Uberi/speech_recognition/compare/3.10.1...3.10.2

3.10.1

4 months ago

SpeechRecognition 3.10.1 is out 🎉 Get all of these and more with a quick pip install --upgrade SpeechRecognition. Enjoy!

What's Changed

New features

Support Python 3.11

Improvements

Refactor recognize_google by @ftnext in https://github.com/Uberi/speech_recognition/pull/721

Thanks to all contributors!

Full Changelog: https://github.com/Uberi/speech_recognition/compare/3.10.0...3.10.1

3.10.0

1 year ago

SpeechRecognition 3.10.0 is out 🎉 Get all of these and more with a quick pip install --upgrade SpeechRecognition. Enjoy!

What's Changed

New features

Support Whisper API by @ftnext in https://github.com/Uberi/speech_recognition/pull/669

Improvements

Thanks❤️

Replace with in-memory stream on recognize_whisper by @ftnext in https://github.com/Uberi/speech_recognition/pull/647
Remove prints that shouldn't be printed by default by @kuzmoyev in https://github.com/Uberi/speech_recognition/pull/651
The codebase is currently being refactored.

Deprecations

Drop inactive Python by @ftnext in https://github.com/Uberi/speech_recognition/pull/650
- SpeechRecognition currently supports Python 3.8+

New Contributors

@kuzmoyev made their first contribution in https://github.com/Uberi/speech_recognition/pull/651

Thanks to all contributors!

Full Changelog: https://github.com/Uberi/speech_recognition/compare/3.9.0...3.10.0

3.9.0

1 year ago

SpeechRecognition 3.9.0 was released in December 2022 🎉 Get all of these and more with a quick pip install --upgrade SpeechRecognition. Enjoy!

What's Changed

New features

Thanks for making SpeechRecognition even more wonderful! 🙌

Add recognize_tensorflow by @chriamue in https://github.com/Uberi/speech_recognition/pull/296
Add recognize_vosk by @mytja in https://github.com/Uberi/speech_recognition/pull/513
Add recognize_amazon and recognize_assemblyai by @chrisspen in https://github.com/Uberi/speech_recognition/pull/434
Add recognize_whisper by @joy-void-joy in https://github.com/Uberi/speech_recognition/pull/625

Bugfixes & improvements

Thanks!👏

Update to speechContext formatting for recognize_google_cloud by @dcam0050 in https://github.com/Uberi/speech_recognition/pull/304
Fix for OSError: [Errno -9988] Stream closed Error by @chriamue in https://github.com/Uberi/speech_recognition/pull/306
Add parameter to change profanity filter level for Google Speech Recognition by @jorgegarciadev in https://github.com/Uberi/speech_recognition/pull/363
Updating Wit API version (20160526 -> 20170307) by @Franck-Dernoncourt in https://github.com/Uberi/speech_recognition/pull/344
Google cloud speech library by @frnsys in https://github.com/Uberi/speech_recognition/pull/406
Fix large cpu consumption in snowboy detect by @Aculeasis in https://github.com/Uberi/speech_recognition/pull/395
Replace Bing Speech API with Azure Speech API by @lastcoolnameleft in https://github.com/Uberi/speech_recognition/pull/389
Removed duplicate code by @jhoelzl in https://github.com/Uberi/speech_recognition/pull/321
fix recognize_google_cloud by @alinerguio in https://github.com/Uberi/speech_recognition/pull/601
Pin pocketsphinx temporarily by @ftnext in https://github.com/Uberi/speech_recognition/pull/627
Specify fp16 parameter for whisper by @ftnext in https://github.com/Uberi/speech_recognition/pull/630

Documentation improvements

Thanks!❤️

Update pocketsphinx.rst by @fygul in https://github.com/Uberi/speech_recognition/pull/396
docs: fix simple typo, covnert -> convert by @timgates42 in https://github.com/Uberi/speech_recognition/pull/536
Update pocketsphinx.rst by @fygul in https://github.com/Uberi/speech_recognition/pull/435

Improvements for developers

Fix Travis build by @native-api in https://github.com/Uberi/speech_recognition/pull/418 (Thanks!)
Fix unit tests of recognize_google method by @ftnext in https://github.com/Uberi/speech_recognition/pull/619

New Contributors

@dcam0050 made their first contribution in https://github.com/Uberi/speech_recognition/pull/304
@chriamue made their first contribution in https://github.com/Uberi/speech_recognition/pull/296
@jorgegarciadev made their first contribution in https://github.com/Uberi/speech_recognition/pull/363
@Franck-Dernoncourt made their first contribution in https://github.com/Uberi/speech_recognition/pull/344
@fygul made their first contribution in https://github.com/Uberi/speech_recognition/pull/396
@frnsys made their first contribution in https://github.com/Uberi/speech_recognition/pull/406
@Aculeasis made their first contribution in https://github.com/Uberi/speech_recognition/pull/395
@lastcoolnameleft made their first contribution in https://github.com/Uberi/speech_recognition/pull/389
@native-api made their first contribution in https://github.com/Uberi/speech_recognition/pull/418
@mytja made their first contribution in https://github.com/Uberi/speech_recognition/pull/513
@alinerguio made their first contribution in https://github.com/Uberi/speech_recognition/pull/601
@chrisspen made their first contribution in https://github.com/Uberi/speech_recognition/pull/434
@timgates42 made their first contribution in https://github.com/Uberi/speech_recognition/pull/536
@joy-void-joy made their first contribution in https://github.com/Uberi/speech_recognition/pull/625

Thanks to all contributors!

Full Changelog: https://github.com/Uberi/speech_recognition/compare/3.8.1...3.9.0

3.8.1

6 years ago

Lots of changes since June! Summary below. Get all of these and more with a quick pip install --upgrade SpeechRecognition.

Snowboy hotwords support for highly efficient, performant listening (thanks @beeedy!). This is implemented as the snowboy_configuration parameter of recognizer_instance.listen.
Configurable Pocketsphinx models - you can now specify your own acoustic parameters, language model, and phoneme dictionary, using the language parameter of recognizer_instance.recognize_sphinx (thanks @frawau!).
audio_data_instance.get_segment(start_ms=None, end_ms=None) is a new method that can be called on any AudioData instance to get a segment of the audio starting at start_ms and ending at end_ms. This is really useful when you want to get, say, only the first five seconds of some audio.
The stopper function returned by listen_in_background now accepts one parameter, wait_for_stop (defaulting to True for backwards compatibility), which determines whether the function waits for the background thread to fully shut down before returning. One advantage is that if wait_for_stop is False, you can call the stopper function from any thread!
New example, demonstrating how to simultaneously listen to and recognize speech with the threaded producer/consumer pattern: threaded_workers.py.
Various improvements and bugfixes:
- Python 3 style type annotations in library documentation.
- recognize_google_cloud now uses the v1 rather than the beta API (thanks @oort7!).
- recognize_google_cloud now returns timestamp info when the show_all parameter is True.
- recognize_bing won't time out as often on credential requests, due to a longer default timeout.
- recognize_google_cloud timeouts respect recognizer_instance.operation_timeout now (thanks @reefactor!).
- Any recognizers using FLAC audio were broken on Linux inside Docker - this is now fixed (thanks @reefactor!).
- Various documentation and lint fixes (thanks @josh-hernandez-exe!).
- Lots of small build system improvements.
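
To make the get_segment item above more concrete, here is a minimal standard-library sketch of the millisecond-to-byte arithmetic that such a method implies for raw mono PCM data. It is an illustration under stated assumptions, not the library's implementation: segment_bytes is a hypothetical helper, and the audio is assumed to be uncompressed with a fixed sample rate and sample width.

```python
def segment_bytes(frame_data, sample_rate, sample_width, start_ms=None, end_ms=None):
    """Return the slice of mono PCM frame_data between start_ms and end_ms.

    Hypothetical helper for illustration; None means "from the start"
    or "to the end", mirroring the defaults described in the notes.
    """
    def to_byte_offset(milliseconds):
        # Convert to a whole number of samples first, so the offset
        # always lands on a sample boundary.
        samples = (milliseconds * sample_rate) // 1000
        return samples * sample_width

    start = 0 if start_ms is None else to_byte_offset(start_ms)
    end = len(frame_data) if end_ms is None else to_byte_offset(end_ms)
    return frame_data[start:end]


# Ten seconds of silent 16-bit (2-byte) mono audio at 16 kHz.
ten_seconds = bytes(10 * 16000 * 2)

# Keep only the first five seconds.
first_five = segment_bytes(ten_seconds, 16000, 2, end_ms=5000)
print(len(first_five) / (16000 * 2))  # duration of the segment in seconds
```

With the library itself, the equivalent call on an AudioData instance would be audio.get_segment(end_ms=5000).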

3.7.1

6 years ago

As usual, get it with pip install --upgrade SpeechRecognition

New grammar parameter for recognizer_instance.recognize_sphinx - now, you can specify a JSGF or FSG grammar to PocketSphinx (thanks @aleneum!).
Update PyAudio to version 0.2.11 - this fixes a couple of memory management issues users have been experiencing.
Update FLAC to 1.3.2 on all platforms - this will make it easier to support more audio formats in the near future.
Fixes for various APIs on Python 3.6+ - small changes in urllib.request behavior made requests fail in certain situations.
Fixes for Bing Speech API timing out due to some backwards incompatible changes to their API.
Restore original IBM audio segmentation behaviour - previously, it would stop recognizing after the first pause. Now, it will recognize all speech in the input audio, as it did before IBM's changes.
Fix links in PocketSphinx docs and library reference. Add-on language models now available from Google Drive, including the now-officially-supported Italian model.
New troubleshooting entries for JACK server in README.
Documentation and build process updates.

3.6.5

7 years ago

Quick bugfix for PortableNamedTemporaryFile:

Fix file descriptor opening on Python 2.
Add tests for Sphinx keyword matching.

3.6.4

7 years ago

Bugfix release!

Fix tempfile.NamedTemporaryFile on Windows, by replacing it with a PortableNamedTemporaryFile class. Previously, the file could not necessarily be reopened after it was first opened.
Documentation/troubleshooting improvements (thanks @hassanmian!).
Add support for 24-bit FLAC audio files (thanks @sudevschiz!).
Fix phrase_time_limit being ignored for listen_in_background (thanks @dodysw!).
Added lots of new audio regression tests.
Code cleanup for tests and examples.
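
The temporary-file fix above addresses a known platform difference: on Windows, a file created by tempfile.NamedTemporaryFile generally cannot be opened a second time by name while it is still open. One common workaround is to create the file with tempfile.mkstemp and delete it manually. The sketch below shows that general technique. ReopenableTempFile is a hypothetical name used for illustration and should not be read as the library's actual PortableNamedTemporaryFile code.

```python
import os
import tempfile


class ReopenableTempFile:
    """Hypothetical helper: a temp file that can be reopened by name while in use."""

    def __init__(self, mode="w+b"):
        self.mode = mode

    def __enter__(self):
        # mkstemp returns an OS-level file descriptor plus the path;
        # the file is not removed automatically when it is closed.
        file_descriptor, self.name = tempfile.mkstemp()
        self._file = os.fdopen(file_descriptor, self.mode)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Clean up manually, since mkstemp does not do it for us.
        self._file.close()
        os.remove(self.name)

    def write(self, data):
        return self._file.write(data)

    def flush(self):
        self._file.flush()


with ReopenableTempFile() as tmp:
    tmp.write(b"RIFF")
    tmp.flush()
    # Reopen the same file by name while the first handle is still open.
    with open(tmp.name, "rb") as reader:
        contents = reader.read()

print(contents)
print(os.path.exists(tmp.name))
```

The trade-off is that cleanup becomes the caller's responsibility, which is why the sketch wraps it in a context manager.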

3.6.3

7 years ago

Small bugfix release:

Handle case when GSR doesn't return a confidence value (thanks @jcsilva!).
Config, style, and release improvements.
Fix console window sometimes popping up on Windows (thanks @Qdrew!).
Switch release over to universal Wheels rather than source distribution.