Faster Whisper Versions

Faster Whisper transcription with CTranslate2

v1.0.1

2 months ago

v0.10.1

2 months ago

Fix the broken tag v0.10.0

v0.10.0

2 months ago
  • Support the "large-v3" model with:
    • The ability to load feature_size/num_mels and other parameters from preprocessor_config.json
    • A new language token for Cantonese (yue)
  • Update CTranslate2 requirement to include the latest version 3.22.0
  • Update tokenizers requirement to include the latest version 0.15
  • Change the hub to fetch models from Systran organization

v1.0.0

2 months ago

v0.9.0

7 months ago
  • Add function faster_whisper.available_models() to list the available model sizes
  • Add model property supported_languages to list the languages accepted by the model
  • Improve error message for invalid task and language parameters
  • Update tokenizers requirement to include the latest version 0.14
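The improved validation for invalid `task` and `language` parameters could look like the following sketch. All names here are hypothetical illustrations, not faster-whisper's actual internals:

```python
# Sketch of parameter validation with a helpful error message.
# The task and language sets below are illustrative and truncated.
_TASKS = {"transcribe", "translate"}
_LANGUAGES = {"en", "fr", "de", "yue"}

def validate_options(task: str, language: str) -> None:
    """Raise a descriptive ValueError for an invalid task or language."""
    if task not in _TASKS:
        raise ValueError(
            f"{task!r} is not a valid task "
            f"(accepted tasks: {', '.join(sorted(_TASKS))})"
        )
    if language not in _LANGUAGES:
        raise ValueError(
            f"{language!r} is not a valid language code "
            f"(accepted codes: {', '.join(sorted(_LANGUAGES))})"
        )
```

Listing the accepted values in the error message is what makes the message actionable, compared to a bare "invalid language" failure.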

v0.8.0

8 months ago

Expose new transcription options

Some generation parameters that were available in the CTranslate2 API are now exposed in faster-whisper:

  • repetition_penalty to penalize the score of previously generated tokens (set > 1 to penalize)
  • no_repeat_ngram_size to prevent repetitions of ngrams with this size
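To illustrate what these two options do, here is a minimal, library-independent sketch of both mechanisms; the real implementation lives inside CTranslate2's decoder, and these function names are hypothetical:

```python
def apply_repetition_penalty(scores, previous_tokens, penalty):
    """Penalize the scores of tokens that were already generated.

    Positive scores are divided by the penalty, negative scores are
    multiplied by it, so penalty > 1 always makes repeats less likely.
    """
    scores = dict(scores)
    for tok in set(previous_tokens):
        if tok in scores:
            s = scores[tok]
            scores[tok] = s / penalty if s > 0 else s * penalty
    return scores

def banned_ngram_tokens(previous_tokens, ngram_size):
    """Return tokens that would complete an n-gram already in the output."""
    if len(previous_tokens) < ngram_size - 1:
        return set()
    prefix = tuple(previous_tokens[-(ngram_size - 1):])
    banned = set()
    for i in range(len(previous_tokens) - ngram_size + 1):
        if tuple(previous_tokens[i:i + ngram_size - 1]) == prefix:
            banned.add(previous_tokens[i + ngram_size - 1])
    return banned
```

At each decoding step, tokens returned by `banned_ngram_tokens` would have their probability zeroed out, guaranteeing no n-gram of the given size repeats.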

Some values that were previously hardcoded in the transcription method are now configurable:

  • prompt_reset_on_temperature to configure after which temperature fallback step the prompt with the previous text should be reset (default value is 0.5)
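The behaviour can be sketched as follows. This is a simplification of the temperature-fallback loop with a hypothetical helper name, not faster-whisper's actual code:

```python
def build_prompt(previous_text: str, temperature: float,
                 prompt_reset_on_temperature: float = 0.5) -> str:
    """Drop the previous-text context once the fallback temperature
    reaches the configured threshold.

    At high temperatures the previous output is likely unreliable, so
    conditioning the next window on it would propagate errors.
    """
    if temperature >= prompt_reset_on_temperature:
        return ""
    return previous_text
```

With the default threshold of 0.5, the prompt survives low-temperature fallbacks but is discarded once decoding has to resort to more random sampling.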

Other changes

  • Fix a possible memory leak when decoding audio with PyAV by forcing the garbage collector to run
  • Add property duration_after_vad in the returned TranscriptionInfo object
  • Add "large" alias for the "large-v2" model
  • Log a warning when the model is English-only but the language parameter is set to something else

v0.7.1

9 months ago
  • Fix a bug related to no_speech_threshold: when the threshold was met for a segment, the next 30-second window reused the same encoder output and was therefore also incorrectly considered non-speech
  • Improve selection of the final result when all temperature fallbacks failed by returning the result with the best log probability
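The selection logic in the second point can be sketched like this (hypothetical names; the real result objects carry more fields):

```python
def select_best_result(fallback_results):
    """When every temperature fallback fails its quality checks,
    return the decoding result with the highest average log probability
    instead of simply keeping the last attempt."""
    return max(fallback_results, key=lambda r: r["avg_logprob"])
```

Since log probabilities are negative, "highest" means closest to zero, i.e. the result the model was most confident about.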

v0.7.0

9 months ago

Improve word-level timestamp heuristics

Several recent improvements to these heuristics from openai-whisper have been ported to faster-whisper.

Support download of user converted models from the Hugging Face Hub

The WhisperModel constructor now accepts any repository ID as argument, for example:

from faster_whisper import WhisperModel

model = WhisperModel("username/whisper-large-v2-ct2")

The utility function download_model has been updated similarly.

Other changes

  • Accept an iterable of token IDs for the argument initial_prompt (useful to include timestamp tokens in the prompt)
  • Avoid computing higher temperatures when no_speech_threshold is met (same as https://github.com/openai/whisper/commit/e334ff141d5444fbf6904edaaf408e5b0b416fe8)
  • Fix truncated output when using a prefix without disabling timestamps
  • Update the minimum required CTranslate2 version to 3.17.0 to include the latest fixes

v0.6.0

11 months ago

Extend TranscriptionInfo with additional properties

  • all_language_probs: the probability of each language (only set when language=None)
  • vad_options: the VAD options that were used for this transcription

Improve robustness on temporary connection issues to the Hugging Face Hub

When the model is loaded from its name like WhisperModel("large-v2"), a request is made to the Hugging Face Hub to check if some files should be downloaded.

This request can fail: the Hugging Face Hub may be down, the internet connection may be temporarily unavailable, etc. These exceptions are now caught, and the library falls back to loading the model directly from the local cache if it exists.
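A minimal sketch of this fallback pattern (the function names are hypothetical; the real code goes through huggingface_hub):

```python
def load_model_files(fetch_from_hub, load_from_cache, model_name):
    """Try the Hub first; on a connection-level failure, fall back to
    the local cache. Re-raise the original error if nothing is cached."""
    try:
        return fetch_from_hub(model_name)
    except OSError:  # e.g. Hub unreachable, network down
        cached = load_from_cache(model_name)
        if cached is None:
            raise
        return cached
```

Passing the two loaders in as callables keeps the retry policy separate from the download details, which also makes the fallback easy to test offline.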

Other changes

  • Enable the onnxruntime dependency for Python 3.11 as the latest version now provides binary wheels for Python 3.11
  • Fix occasional IndexError on empty segments when using word_timestamps=True
  • Export __version__ at the module level
  • Include missing requirement files in the released source distribution