Whisper.cpp Versions

Port of OpenAI's Whisper model in C/C++

v1.2.0

1 year ago

Overview

In this release we significantly reduce the memory usage during inference by introducing "scratch" buffers to ggml.

The new memory requirements per model are as follows:

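| Model  | Disk   | Mem (Old) | Mem (New) |
|--------|--------|-----------|-----------|
| tiny   | 75 MB  | ~390 MB   | ~125 MB   |
| base   | 142 MB | ~500 MB   | ~210 MB   |
| small  | 466 MB | ~1.0 GB   | ~600 MB   |
| medium | 1.5 GB | ~2.6 GB   | ~1.7 GB   |
| large  | 2.9 GB | ~4.7 GB   | ~3.3 GB   |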

The idea is simple: instead of creating a new memory buffer for each new tensor in the computation, we reuse the memory of old tensors that are no longer needed. The implementation is in PR #431. It's not very clean, and I think there is a better way to do this, but it works for now.
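
To make the reuse idea concrete, here is a minimal sketch of a bump allocator over one preallocated region that is reset between layers. It is not the code from PR #431: the `ScratchBuffer` class, `run_layers`, and the 16-byte alignment are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical bump allocator over one preallocated "scratch" region.
// Not ggml's implementation; all names here are made up for illustration.
class ScratchBuffer {
public:
    explicit ScratchBuffer(std::size_t size) : data_(size), offset_(0), peak_(0) {}

    // Hand out `n` bytes, starting at an offset that is a multiple of 16
    // (relative to the start of the region).
    void * alloc(std::size_t n) {
        const std::size_t aligned = (offset_ + 15) & ~static_cast<std::size_t>(15);
        if (aligned + n > data_.size()) {
            throw std::runtime_error("scratch buffer too small");
        }
        offset_ = aligned + n;
        if (offset_ > peak_) peak_ = offset_;
        return data_.data() + aligned;
    }

    // Treat all previous allocations as dead so their bytes can be reused.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }
    std::size_t peak() const { return peak_; }

private:
    std::vector<std::uint8_t> data_;
    std::size_t offset_;
    std::size_t peak_;
};

// Simulate `n_layers` layers that each need `bytes_per_layer` of temporaries.
// Returns the peak number of scratch bytes that were in use at any time.
std::size_t run_layers(ScratchBuffer & buf, int n_layers, std::size_t bytes_per_layer) {
    for (int i = 0; i < n_layers; ++i) {
        buf.reset();                              // previous layer's temporaries are no longer needed
        void * tmp = buf.alloc(bytes_per_layer);  // a real graph would write tensor data here
        (void) tmp;
    }
    return buf.peak();
}
```

With a 1024-byte region, eight layers of 256 bytes each fit because the peak stays at 256 bytes; allocating a fresh buffer per tensor would have needed 2048 bytes.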

Additionally, there might be some inference speed improvements on Apple Silicon in the Decoder part of the transformer. I haven't done proper benchmarks, but there seems to be a performance boost of about 30%. The results are identical to v1.1.1.

What's Changed

Core `ggml` / `whisper`

whisper : PPC64 big-endian support by @fitzsim in https://github.com/ggerganov/whisper.cpp/pull/398
whisper : condition sampled timestamp tokens to be monotonically increasing by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/425
wasm : fix typo in helper.js by @bhbs in https://github.com/ggerganov/whisper.cpp/pull/459
ggml/whisper : reduce memory usage during inference by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/431
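
The timestamp change listed above (PR #425) can be pictured as a logit mask applied before sampling. The following is only a sketch under assumptions: the helper name, the convention that every token id at or above `token_beg` is a timestamp, and the choice to still allow a repeated timestamp are illustrative and not taken from the PR.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: forbid timestamp tokens that would move backwards in time.
// `token_beg` is the id of the first timestamp token; ids >= token_beg are timestamps.
// `last_ts` is the id of the most recently sampled timestamp token, or -1 if none yet.
void mask_earlier_timestamps(std::vector<float> & logits, int token_beg, int last_ts) {
    if (last_ts < token_beg) {
        return; // no timestamp sampled yet, nothing to constrain
    }
    const int n = static_cast<int>(logits.size());
    // Mask every timestamp strictly earlier than the last one; equal is still allowed.
    for (int id = token_beg; id < last_ts && id < n; ++id) {
        logits[id] = -INFINITY; // becomes probability zero after softmax
    }
}
```

Text tokens (ids below `token_beg`) are left untouched, so the mask only restricts which timestamps can follow.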

Bindings

ci : run workflows on pull requests + bindings depend on .h by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/446
go : added wrappers to reset and print timings by @glaslos in https://github.com/ggerganov/whisper.cpp/pull/436
go : add WhisperLangAutoDetect method to go binding by @RobinXL in https://github.com/ggerganov/whisper.cpp/pull/451
go : add wrapper for system info by @glaslos in https://github.com/ggerganov/whisper.cpp/pull/456
go : support "auto" as an option when set language by @polarmoon in https://github.com/ggerganov/whisper.cpp/pull/462

Examples

whisper.wasm : add labels for easier radio selection by @kokes in https://github.com/ggerganov/whisper.cpp/pull/435
livestream.sh : run main with model arg instead of default by @EricTendian in https://github.com/ggerganov/whisper.cpp/pull/453
main : CSV format export trimmed spaces fix by @alex-bacart in https://github.com/ggerganov/whisper.cpp/pull/444
addon.node : using whisper as a Node.js addon by @chenqianhe in https://github.com/ggerganov/whisper.cpp/pull/443

New Contributors

@kokes made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/435
@glaslos made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/436
@EricTendian made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/453
@RobinXL made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/451
@alex-bacart made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/444
@bhbs made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/459
@polarmoon made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/462
@chenqianhe made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/443

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.1.1...v1.2.0

Highlights

I'll use these release notes to write some random thoughts about the project - sort of a short blog post.

I'm really happy with how whisper.cpp has turned out so far. The reception in the ML community has been very positive: most people seem to be excited by the simplicity of the implementation and the fact that it is quite self-contained. I receive a lot of questions about the project and about various ideas that it can be applied to. I really enjoy it and I try to respond to everyone!

I also find it very satisfying that so many people are already contributing. To me this illustrates the power of open-source collaboration. The contributions not only improve the functionality and the quality of the code, but also help to generate various new ideas and approaches to explore.

Another interesting thing is that the project keeps on giving. Every time I start to think that now is a good time to put it in the background for a while and focus on other stuff, some new cool idea pops up and I can't help but start working on it. Having this custom implementation allows me to interact with the model on a lower level, which opens up some interesting ways to explore it.

So far the development has been focused on improving the performance, expanding the platform coverage and having robust decoding strategies with a variety of examples. During this time, several ideas have accumulated that I find interesting to explore (diarization, token-level timestamps, improved timestamp accuracy, etc.). I think I'll try to focus more on these in the future and see if I can achieve something interesting.

Windows port of whisper.cpp utilising vendor-agnostic GPGPU based on DirectCompute by @Const-me

https://github.com/Const-me/Whisper

"The New Yorker" article featuring whisper.cpp

Whispers of A.I.’s Modular Future

v1.1.1

1 year ago

Overview

Since the v1.1.0 pre-release there have been several reports of improved transcription quality. Together with my observations, I think we can declare version v1.1.1 as "stable".

There were actually a couple of bug-fixes implemented since v1.1.0, so make sure to update to v1.1.1 for optimal results.

Another update is that the prototype for v1.2.0 is almost ready (https://github.com/ggerganov/whisper.cpp/pull/431). Initial results indicate that the memory usage can be reduced by a factor of 2-3 for the smaller models.

You can provide feedback in the existing v1.1.0 discussion.

What's Changed

Core `ggml` / `whisper`

whisper : perform entropy check only when we have at least 32 tokens 1a91c19af929d6dc614a9f3b03026fb23be002a6
whisper : fix condition for providing past prompt (critical) 78f166174f126345ed87cc8f6941af1905c4a0f2
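
The entropy check mentioned above can be sketched as follows, assuming the entropy is computed from token frequencies in the most recent window. The function names and the threshold passed by the caller are illustrative assumptions, not the library's code. One plausible reason for requiring a minimum length: a sequence of n tokens can have an entropy of at most ln(n), so very short sequences would always look "repetitive".

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Shannon entropy (in nats) of the token frequencies in the last `window` tokens.
double tail_entropy(const std::vector<int> & tokens, std::size_t window) {
    const std::size_t n = tokens.size() < window ? tokens.size() : window;
    if (n == 0) return 0.0;
    std::map<int, int> counts;
    for (std::size_t i = tokens.size() - n; i < tokens.size(); ++i) {
        counts[tokens[i]]++;
    }
    double h = 0.0;
    for (const auto & kv : counts) {
        const double p = static_cast<double>(kv.second) / static_cast<double>(n);
        h -= p * std::log(p);
    }
    return h;
}

// Hypothetical check: flag a sequence as repetitive only if it is long enough to judge.
bool looks_repetitive(const std::vector<int> & tokens, double threshold,
                      std::size_t min_tokens = 32) {
    if (tokens.size() < min_tokens) {
        return false; // too short: skip the entropy check entirely
    }
    return tail_entropy(tokens, min_tokens) < threshold;
}
```

For example, ten identical tokens are not flagged because the check is skipped, while forty identical tokens have zero entropy and are flagged for any positive threshold.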

Bindings

go : remove sample_best and sample_timestamp bindings by @Trojan295 in https://github.com/ggerganov/whisper.cpp/pull/409

Examples

main : re-enable temperature fallback f583e2d2f5a60e6ebf5bb2819ba4c4d348d41ea2
main : add an option to accept optional output filenames by @garychia in https://github.com/ggerganov/whisper.cpp/pull/424
whisper.android : use AssetManager for Android by @Digipom in https://github.com/ggerganov/whisper.cpp/pull/415
whisper.wasm : add small and small.en models 206fc93396936725bd362c93796cfdc8a87f8509
bench : add memcpy and ggml_mul_mat benchmarks (experimental) 1290fc64572f434f2f36721d2e2b0913cec0178a

New Contributors

@Trojan295 made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/409
@garychia made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/424

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.1.0...v1.1.1

v1.1.0

1 year ago

Overview

The major change in this pre-release is the improved decoding implementation in whisper.cpp:

Support for average logprob and entropy based criteria for fallback
Support for temperature T > 0
Improved Greedy decoder via best_of parameter for T > 0
Add beam search decoding (a.k.a. beam_size)
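
Two of the ideas in this list, sampling at temperature T > 0 and choosing the best of several candidates, can be sketched in a few lines. This is a generic illustration, not the decoder from #291: both function names are hypothetical, and ranking candidates by average log-probability is an assumption about how "best" might be defined.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax of logits / T for T > 0. A higher T flattens the distribution.
// Precondition: `logits` is not empty.
std::vector<double> softmax_with_temperature(const std::vector<double> & logits, double T) {
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> probs(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp((logits[i] - max_logit) / T); // subtract max for numerical stability
        sum += probs[i];
    }
    for (double & p : probs) p /= sum;
    return probs;
}

// Hypothetical "best of N" selection: index of the candidate whose tokens have
// the highest average log-probability. Empty candidates are skipped.
std::size_t pick_best(const std::vector<std::vector<double>> & candidates) {
    std::size_t best = 0;
    double best_avg = -INFINITY;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (candidates[i].empty()) continue;
        double sum = 0.0;
        for (double lp : candidates[i]) sum += lp;
        const double avg = sum / static_cast<double>(candidates[i].size());
        if (avg > best_avg) {
            best_avg = avg;
            best = i;
        }
    }
    return best;
}
```

With logits {2, 1, 0}, T = 0.5 concentrates more probability on the first token than T = 2.0 does, which is why sampling several candidates only makes sense for T > 0: at T = 0 every candidate would be the same greedy sequence.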

More information about the decoding changes can be found in #291. Additionally, there are a few performance improvements for Apple Silicon, WASM and non-F16C platforms. Support for POWER9 architectures has been added.

The reason that this is a pre-release and not an official release is that the new implementation has not been sufficiently tested yet and the existing bindings for other languages have not been updated to support the API changes. The official release 1.1.x will be created when there is enough feedback about the new decoding implementation and when the bindings have been updated. So make sure to send your feedback in the discussion created for this pre-release. For now, the 1.0.4 release should be considered more stable.

What's Changed

Core `ggml` / `whisper`

ggml : POWER9 support by @fitzsim in #320, #349, #369
ggml : simplify the SIMD code by @ggerganov in #324
ggml : add SSE3 and fp16 conversion lookup table by @abitofevrything in #368
ggml : utilise Accelerate's vDSP for some computations d51fc3ee0a0038cdf1522ca3d58b58299de41eb8
ggml : speed-up softmax compute via Accelerate and loop unrolling d61d55cd4b9fe77511c8eea28d0220ce552f7008
ggml : do not start extra threads when using BLAS d347a59a5f224f6a5ab0084ec95715451972d3b0
whisper : do sample_to_timestamp calculation with 64 bit precision to avoid overflow by @boolemancer in #388
whisper : various code clean-up and improvements by @asmaloney in #317 #318 #319 #322 etc
whisper : improve decoding by @ggerganov in #291
whisper : account for speed_up flag for short audio #405
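
The 64-bit `sample_to_timestamp` fix from #388 listed above is easy to motivate with arithmetic. The sketch below assumes 16 kHz audio and timestamps counted in units of 10 ms; the function names and the constant are illustrative and not copied from the source.

```cpp
#include <cstdint>
#include <limits>

constexpr int kSampleRate = 16000; // assumption: 16 kHz input audio

// Convert a sample index to a timestamp in units of 10 ms using 64-bit math.
// The multiplication is done in 64 bits, so it cannot overflow for any int input.
std::int64_t sample_to_timestamp_64(int i_sample) {
    return (100LL * i_sample) / kSampleRate;
}

// Largest sample index for which 100 * i_sample still fits in a signed 32-bit integer.
constexpr int max_safe_sample_32() {
    return std::numeric_limits<std::int32_t>::max() / 100;
}
```

Under these assumptions, a 32-bit multiplication would overflow past sample 21,474,836, which is roughly 22 minutes of audio, so longer recordings need the wider type.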

C-style API

Add loader class to allow loading from buffer and others by @prsyahmi in https://github.com/ggerganov/whisper.cpp/pull/353
Add whisper_token_data::plog
Add whisper_init_from_file()
Add whisper_init_from_buffer()
Change whisper_init()
Remove whisper_sample_best()
Remove whisper_sample_timestamp()
Add whisper_n_audio_ctx()
Add whisper_get_logits()
Remove whisper_get_probs()
Change struct whisper_full_params

Bindings

Golang bindings by @djthorpe in #287, #379, #384

Examples

whisper.android : remove android ABI constraint by @Digipom in #301
whisper.swiftui : SwiftUI example by @Digipom in #308
main : add -ocsv, aka --output-csv for writing CSV file containing millisecond timestamps by @NielsMayer in #340
command : refactor to split command list & general transcription modes by @asmaloney in #331
command : always-prompt mode by @dnhkng in #383
stream : fix data race on bool + avoid division-by-zero a466c3404dc62dc221061bb37fb8f78741d749b8
stream : fix a bug that inserted a lot of empty audio at the start a6dbd9188b13378dc36e2c669b9a22e17b4201d1
bench.wasm : print system info fafd78945d5a7ea11ffa31fa6c05dd6593b7d031

New Contributors

@djthorpe made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/287
@0xmohit made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/296
@asmaloney made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/298
@fitzsim made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/320
@NielsMayer made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/340
@aviks made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/345
@eltociear made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/346
@abitofevrything made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/368
@Mike-Bell made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/381
@dnhkng made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/383
@prsyahmi made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/353
@ianb made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/391

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.0.4...v1.1.0

Highlights

Sample SwiftUI application examples/whisper.swiftui

1.0.4

1 year ago

What's Changed

Core `ggml` / `whisper`

Make ggml compatible with c99 9955fa4ed7cc694d5d47fe0bb5f0d02066f9cbac | 0f117594066a213cc3cc9261c8906f316e6fb153
Fix UB causing asserts in Debug when reading the model vocabulary 124c718c73f915f3e4235ae2af8841356e76177d
Minor improvements in the Greedy decoding strategy 6a7c82501e3794724ba80bfb9a983810af036803
Add Windows build without OpenBLAS by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/282
Add whisper_tokenize() - basic text tokenization bf69b669a00e457b6bfa69b97f1fdf2578d3e403
Language auto-detect option by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/286
Add AVX,AVX2 support for ggml_vec_scale_f32 by @katsu560 in https://github.com/ggerganov/whisper.cpp/pull/285
Implement extra cases for ggml_compute_forward_dup_f16() a7047b2a28a8eccb94318eca8a3207894d3822c7
Added Roadmap and updated F.A.Q. discussion #126

C-style API

Add whisper_tokenize()
Add whisper_lang_max_id()
Add whisper_lang_str()
Add whisper_lang_auto_detect()
Add whisper_token_lang()

Examples

Improve prompting in "talk" example a613f16aec81b7715cdbd4386ba62ab2ff1216b3
Add "sliding window" mode to "stream" example b0f8013eb9f371b500abf1e3c506399ce7f59b11
Add Android sample by @Digipom in https://github.com/ggerganov/whisper.cpp/pull/277
Guided mode for the "command" example by @ggerganov in https://github.com/ggerganov/whisper.cpp/pull/271
Example "main" supports --prompt option b8065d90f5fdcdb445a8fb3f4717cba54c332cac
Example "main" supports --print-progress option 32fbc8cd04912904cf84af7c5bd0e0e711a6f021
Example "main" supports --lang auto option fba10a4c68f0533a339174ef81c6a18ea228d331

New Contributors

@Digipom made their first contribution in https://github.com/ggerganov/whisper.cpp/pull/277

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/1.0.3...1.0.4

Highlights

Sample Android application examples/whisper.android

General-purpose, short voice command detection on Raspberry Pi 4 using examples/command:

https://user-images.githubusercontent.com/1991296/208255185-6e9d60ea-4bc8-4b64-b731-8ca9f3b7333b.mp4
