Faster Whisper Versions

Faster Whisper transcription with CTranslate2

v1.0.2

4 weeks ago

v1.0.1

3 months ago

v0.10.1

3 months ago

Fix the broken tag v0.10.0

v0.10.0

3 months ago
  • Support the "large-v3" model with
    • The ability to load feature_size/num_mels and other settings from preprocessor_config.json
    • A new language token for Cantonese (yue)
  • Update CTranslate2 requirement to include the latest version 3.22.0
  • Update tokenizers requirement to include the latest version 0.15
  • Change the hub to fetch models from Systran organization
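The preprocessor_config.json lookup mentioned above can be pictured with a minimal sketch. It is not the library's own loader: the helper name load_feature_kwargs and the default of 80 mel bins are assumptions made for illustration, and the only point shown is that values found in the file override built-in defaults (for example 128 mel bins for "large-v3").

```python
import json
import os
import tempfile


# Hypothetical helper for illustration; not part of the faster-whisper API.
def load_feature_kwargs(model_dir, defaults=None):
    """Return feature settings, overridden by preprocessor_config.json if present."""
    kwargs = dict(defaults or {"feature_size": 80})  # assumed default
    config_path = os.path.join(model_dir, "preprocessor_config.json")
    if os.path.isfile(config_path):
        with open(config_path, "r", encoding="utf-8") as f:
            config = json.load(f)
        # Only override keys we already know about; ignore everything else.
        for key in kwargs:
            if key in config:
                kwargs[key] = config[key]
    return kwargs


# A temporary directory stands in for a downloaded model folder.
with tempfile.TemporaryDirectory() as model_dir:
    without_config = load_feature_kwargs(model_dir)
    with open(os.path.join(model_dir, "preprocessor_config.json"), "w", encoding="utf-8") as f:
        json.dump({"feature_size": 128, "sampling_rate": 16000}, f)
    with_config = load_feature_kwargs(model_dir)
```

Falling back to defaults when the file is missing keeps older converted models, which ship without this file, loadable.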

v1.0.0

3 months ago

0.10.0

6 months ago
  • Support the "large-v3" model with
    • The ability to load feature_size/num_mels and other settings from preprocessor_config.json
    • A new language token for Cantonese (yue)
  • Update CTranslate2 requirement to include the latest version 3.22.0
  • Update tokenizers requirement to include the latest version 0.15
  • Change the hub to fetch models from Systran organization

v0.9.0

8 months ago
  • Add function faster_whisper.available_models() to list the available model sizes
  • Add model property supported_languages to list the languages accepted by the model
  • Improve error message for invalid task and language parameters
  • Update tokenizers requirement to include the latest version 0.14
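As a rough usage sketch for the two additions in this release: the import is guarded so the snippet still runs where faster-whisper is not installed, and the wrapper names list_model_sizes and describe_languages are hypothetical. Only faster_whisper.available_models() and the supported_languages property come from the notes above.

```python
try:
    import faster_whisper
except ImportError:  # library not installed in this environment
    faster_whisper = None


def list_model_sizes():
    """Return the model sizes reported by the library, or [] if it is not installed."""
    if faster_whisper is None:
        return []
    return list(faster_whisper.available_models())


def describe_languages(model):
    """Return the language codes accepted by an already loaded WhisperModel."""
    return list(model.supported_languages)


sizes = list_model_sizes()
```

describe_languages expects a model instance that has already been loaded, so it is left uncalled here to avoid downloading weights.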

v0.8.0

8 months ago

Expose new transcription options

Some generation parameters were available in the CTranslate2 API but not exposed in faster-whisper:

  • repetition_penalty to penalize the score of previously generated tokens (set > 1 to penalize)
  • no_repeat_ngram_size to prevent repetitions of ngrams with this size
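To make the two options above concrete, here is a small pure-Python sketch of what such constraints typically do to the next-token scores. It follows the common formulation of a repetition penalty and of n-gram blocking, which is an assumption here and not a copy of the CTranslate2 implementation; both function names are hypothetical.

```python
def apply_repetition_penalty(scores, generated, penalty):
    """Lower the score of tokens that were already generated (penalty > 1)."""
    out = list(scores)
    for token in set(generated):
        # Dividing a positive score or multiplying a negative one both reduce it.
        out[token] = out[token] / penalty if out[token] > 0 else out[token] * penalty
    return out


def banned_next_tokens(generated, ngram_size):
    """Return tokens that would complete an n-gram already present in `generated`."""
    if ngram_size <= 0 or len(generated) < ngram_size - 1:
        return set()
    # The last (ngram_size - 1) tokens form the prefix of the next n-gram.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        if tuple(generated[i:i + ngram_size - 1]) == prefix:
            banned.add(generated[i + ngram_size - 1])
    return banned


penalized = apply_repetition_penalty([2.0, -1.0, 0.5], generated=[0, 1], penalty=2.0)
blocked = banned_next_tokens([5, 6, 7, 5, 6], ngram_size=3)
```

In the second call the sequence already contains the trigram (5, 6, 7), so after the trailing (5, 6) the token 7 is the one that would be blocked.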

Some values were previously hardcoded in the transcription method:

  • prompt_reset_on_temperature to set the temperature above which, during fallback, the prompt built from the previous text is reset (default value is 0.5)
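The effect of this option can be sketched as a single decision made after each decoded window. The function name next_prompt_start and its arguments are hypothetical, and the strict "greater than" comparison is an assumption about how the threshold is applied.

```python
def next_prompt_start(all_tokens, prompt_reset_since, temperature,
                      condition_on_previous_text=True,
                      prompt_reset_on_temperature=0.5):
    """Return the index in `all_tokens` from which the next prompt is built."""
    if not condition_on_previous_text or temperature > prompt_reset_on_temperature:
        # Drop the previous text: the next window starts with an empty prompt.
        return len(all_tokens)
    return prompt_reset_since


kept = next_prompt_start([1, 2, 3], prompt_reset_since=0, temperature=0.0)
reset = next_prompt_start([1, 2, 3], prompt_reset_since=0, temperature=0.6)
```

The idea is that a window which needed a high temperature is likely unreliable, so its text should not condition the next window.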

Other changes

  • Fix a possible memory leak when decoding audio with PyAV by forcing the garbage collector to run
  • Add property duration_after_vad in the returned TranscriptionInfo object
  • Add "large" alias for the "large-v2" model
  • Log a warning when the model is English-only but the language parameter is set to something else

v0.7.1

10 months ago
  • Fix a bug related to no_speech_threshold: when the threshold was met for a segment, the next 30-second window reused the same encoder output and was also treated as non-speech
  • Improve selection of the final result when all temperature fallbacks fail by returning the result with the best log probability
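The second fix can be illustrated with a short sketch of the selection step. The Attempt record, the pick_result helper, and the threshold values are assumptions for illustration; the point is only that when no attempt passes the quality checks, the one with the highest average log probability is returned.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One decoding attempt at a given temperature (hypothetical record)."""
    temperature: float
    avg_logprob: float
    compression_ratio: float


def pick_result(attempts, log_prob_threshold=-1.0, compression_ratio_threshold=2.4):
    """Return the first attempt passing both checks, else the best-scoring one."""
    for attempt in attempts:
        passed = (attempt.avg_logprob >= log_prob_threshold
                  and attempt.compression_ratio <= compression_ratio_threshold)
        if passed:
            return attempt
    # Every fallback failed: keep the attempt with the best log probability.
    return max(attempts, key=lambda a: a.avg_logprob)


all_failed = [Attempt(0.0, -1.5, 3.0), Attempt(0.2, -1.2, 2.0), Attempt(0.4, -1.3, 2.0)]
chosen = pick_result(all_failed)
```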

v0.7.0

10 months ago

Improve word-level timestamps heuristics

Some recent improvements from openai-whisper are ported to faster-whisper.

Support download of user converted models from the Hugging Face Hub

The WhisperModel constructor now accepts any repository ID as argument, for example:

model = WhisperModel("username/whisper-large-v2-ct2")

The utility function download_model has been updated similarly.

Other changes

  • Accept an iterable of token IDs for the argument initial_prompt (useful to include timestamp tokens in the prompt)
  • Avoid computing higher temperatures when no_speech_threshold is met (same as https://github.com/openai/whisper/commit/e334ff141d5444fbf6904edaaf408e5b0b416fe8)
  • Fix truncated output when using a prefix without disabling timestamps
  • Update the minimum required CTranslate2 version to 3.17.0 to include the latest fixes