whisper.cpp Release Notes

Port of OpenAI's Whisper model in C/C++

v1.5.5

1 week ago

Overview

Many small incremental updates, plus token-level timestamps with DTW by @denersc in https://github.com/ggerganov/whisper.cpp/pull/1485. Feedback is welcome!
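
For reference, a minimal sketch of enabling the DTW token-level timestamps via the C API. The field and enum names here (dtw_token_timestamps, dtw_aheads_preset, WHISPER_AHEADS_BASE_EN, t_dtw) are taken from the PR and should be checked against the whisper.h shipped with this release:

    #include <whisper.h>

    int main(void) {
        struct whisper_context_params cparams = whisper_context_default_params();
        cparams.dtw_token_timestamps = true;                   // enable DTW timestamps
        cparams.dtw_aheads_preset    = WHISPER_AHEADS_BASE_EN; // alignment heads for base.en

        struct whisper_context * ctx =
            whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
        if (!ctx) return 1;

        // ... run whisper_full() as usual; per-token DTW timestamps are then
        // available via whisper_full_get_token_data(ctx, i_segment, i_token).t_dtw

        whisper_free(ctx);
        return 0;
    }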

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.5.4...v1.5.5

v1.5.4

3 months ago

Overview

  • Faster Core ML ANE models (#1716)
  • Fix a CUDA bug causing random errors in the transcription
  • Fix SwiftUI example build

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.5.3...v1.5.4

v1.5.3

3 months ago

Overview

Minor maintenance release:

  • Fix CUDA issues where the transcription produces garbage
  • Fix quantized models to work with the CUDA backend
  • Allow whisper.cpp and llama.cpp to be used together in SwiftUI projects

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.5.2...v1.5.3

v1.5.2

4 months ago

Overview

Minor maintenance release:

  • Re-enable CPU BLAS processing after fixing a regression (#1583)

Add new example: wchess

https://github.com/ggerganov/whisper.cpp/assets/1991296/c2b2f03c-9684-49f3-8106-357d2d4e67fa

Shoutout to @fraxy-v (implementation) and @ejones (grammar) for making it work!

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.5.1...v1.5.2

v1.5.1

5 months ago

Overview

Minor update:

  • With Metal, automatically fall back to the CPU if the device does not support the Apple7 GPU family
  • Add server example

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.5.0...v1.5.1

v1.5.0

5 months ago

Overview

This major release includes the following changes:

  • Full GPU processing of the Encoder and the Decoder with CUDA and Metal is now supported
  • Efficient beam-search implementation via batched decoding and unified KV cache
  • Full quantization support of all available ggml quantization types
  • Support for grammar constrained sampling
  • Support for Distil Whisper models
  • Support for Whisper Large-v3

and more

Full GPU support

On Apple Silicon, GPU support has been available to a large extent since September 15. However, part of the Encoder was still being executed on the CPU due to the lack of MSL kernels for the convolution operations. These kernels are now available, resulting in an additional Encoder speed-up in this release:


Encoder performance on Apple M1 Max - before and after (plot by @dreness)

For NVIDIA hardware, the entire computation can now be offloaded to the GPU, which results in a significant performance boost. For a detailed performance breakdown, check out the Benchmarks section below.

The GPU processing on Apple Silicon is enabled by default, while for NVIDIA you need to build with WHISPER_CUBLAS=1:

# Apple Silicon
make

# NVIDIA
WHISPER_CUBLAS=1 make

Implementation: https://github.com/ggerganov/whisper.cpp/pull/1472

Special credits to: @FSSRepo, @slaren

Batched decoding + efficient Beam Search

At last, whisper.cpp now supports efficient Beam Search decoding. The missing piece was an implementation of batched decoding, which now closely follows the unified KV cache idea from llama.cpp. On modern NVIDIA hardware, the performance with 5 beams is the same as with 1 beam thanks to the large amount of computing power available. With Metal, decoding with 5 beams is a bit slower than with 1 beam, but it is significantly faster than the 5x slowdown observed with the old naive implementation.

Beam Search is now enabled by default in whisper.cpp to match the original OpenAI Whisper implementation. For more performance details, check out the Benchmarks section below.

Implementation: https://github.com/ggerganov/whisper.cpp/pull/1486
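
For reference, the beam size can be configured through the C API (the main example also exposes it via the -bs flag). A minimal sketch, assuming ctx holds an initialized context and pcm holds 16 kHz mono float samples:

    #include <whisper.h>

    void transcribe_beam(struct whisper_context * ctx, const float * pcm, int n_samples) {
        struct whisper_full_params wparams =
            whisper_full_default_params(WHISPER_SAMPLING_BEAM_SEARCH);
        wparams.beam_search.beam_size = 5; // the new default

        whisper_full(ctx, wparams, pcm, n_samples);
    }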

Quantization support

All ggml quantization types are now supported, and quantization mixtures for the Whisper model can be implemented. It is still unclear how quantization affects the quality - this is an interesting area to explore in the future.
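
As a concrete example, a ggml model can be quantized with the bundled quantize tool and then used as a drop-in replacement (the paths below assume the base.en model has already been downloaded):

# quantize to Q5_0 (q4_0, q4_1, q5_1, q8_0, etc. are also accepted)
./quantize models/ggml-base.en.bin models/ggml-base.en-q5_0.bin q5_0

# use the quantized model exactly like the f16 one
./main -m models/ggml-base.en-q5_0.bin -f samples/jfk.wav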

Grammar sampling

The decoder output can now be constrained with a GBNF grammar. This can be a useful technique for further improving the transcription quality in situations where the set of possible phrases is limited.
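
For illustration, the grammar format is the same GBNF used by llama.cpp; a hypothetical grammar limiting the output to a handful of voice commands could look like:

root ::= ("start" | "stop" | "move " dir) "."
dir  ::= "left" | "right" | "up" | "down"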

https://github.com/ggerganov/whisper.cpp/assets/377495/d24716e2-5e9c-441b-8c6b-395922dccbf4

Implementation: https://github.com/ggerganov/whisper.cpp/pull/1229

Special credits to @ejones

Distil Whisper

Recently, Distil Whisper models have been released: https://huggingface.co/distil-whisper

whisper.cpp offers support for these models, although it still lacks a full implementation of the proposed chunking strategy. Performance details for the distilled models are included in the Benchmarks section below.
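
Once a distilled model has been converted to ggml format, it runs like any other model. A sketch (the file name here is hypothetical - convert the Hugging Face model to ggml first):

./main -m models/ggml-distil-medium.en.bin -f samples/jfk.wav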

Implementation: https://github.com/ggerganov/whisper.cpp/pull/1424

Whisper Large-v3

Recently, OpenAI released version 3 of the Large model: https://github.com/openai/whisper/pull/1761

Implementation: https://github.com/ggerganov/whisper.cpp/pull/1444
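
A sketch of downloading and running the new model with the bundled download script:

# download the large-v3 model and transcribe with it
bash ./models/download-ggml-model.sh large-v3
./main -m models/ggml-large-v3.bin -f samples/jfk.wav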

Benchmarks

Below is a breakdown of the performance of whisper.cpp on Apple Silicon, NVIDIA and CPU. The tables show the Encoder and Decoder speed in ms/tok. The Dec. column corresponds to batch size 1. The Bch5 column corresponds to batch size 5. The PP column corresponds to batch size 128.

For optimal Beam Search performance, the Bch5 number should be 5 times smaller than Dec.

| Hw       | Config | Model         | Th | Enc.   | Dec.  | Bch5 | PP   | Commit  |
|----------|--------|---------------|----|--------|-------|------|------|---------|
| M2 Ultra | METAL  | tiny          | 1  | 11.14  | 1.40  | 0.49 | 0.01 | ccc85b4 |
| M2 Ultra | METAL  | tiny-q5_0     | 1  | 11.51  | 1.41  | 0.52 | 0.01 | ccc85b4 |
| M2 Ultra | METAL  | tiny-q5_1     | 1  | 12.21  | 1.41  | 0.52 | 0.01 | ccc85b4 |
| M2 Ultra | METAL  | base          | 1  | 20.21  | 2.05  | 0.77 | 0.02 | ccc85b4 |
| M2 Ultra | METAL  | base-q5_0     | 1  | 19.89  | 1.96  | 0.81 | 0.02 | ccc85b4 |
| M2 Ultra | METAL  | base-q5_1     | 1  | 20.14  | 2.02  | 0.81 | 0.02 | ccc85b4 |
| M2 Ultra | METAL  | small         | 1  | 51.01  | 3.97  | 1.74 | 0.05 | ccc85b4 |
| M2 Ultra | METAL  | small-q5_0    | 1  | 56.86  | 4.09  | 1.85 | 0.06 | ccc85b4 |
| M2 Ultra | METAL  | small-q5_1    | 1  | 56.81  | 4.14  | 1.85 | 0.06 | ccc85b4 |
| M2 Ultra | METAL  | medium        | 1  | 141.21 | 8.47  | 3.98 | 0.13 | ccc85b4 |
| M2 Ultra | METAL  | medium-q5_0   | 1  | 160.56 | 8.27  | 4.18 | 0.14 | ccc85b4 |
| M2 Ultra | METAL  | medium-q5_1   | 1  | 160.52 | 8.40  | 4.15 | 0.14 | ccc85b4 |
| M2 Ultra | METAL  | medium-dis    | 1  | 128.14 | 1.13  | 0.43 | 0.02 | ccc85b4 |
| M2 Ultra | METAL  | large-v2      | 1  | 248.73 | 11.96 | 6.08 | 0.22 | ccc85b4 |
| M2 Ultra | METAL  | large-v2-q5_0 | 1  | 286.31 | 11.99 | 6.60 | 0.26 | ccc85b4 |
| M2 Ultra | METAL  | large-v2-q5_1 | 1  | 284.56 | 12.42 | 6.47 | 0.26 | ccc85b4 |
| M2 Ultra | METAL  | large-v2-dis  | 1  | 224.31 | 1.26  | 0.49 | 0.02 | ccc85b4 |

| Hw       | Config       | Model    | Th | Enc.   | Dec.  | Bch5 | PP   | Commit  |
|----------|--------------|----------|----|--------|-------|------|------|---------|
| M2 Ultra | COREML METAL | tiny     | 1  | 7.60   | 1.41  | 0.50 | 0.01 | ccc85b4 |
| M2 Ultra | COREML METAL | base     | 1  | 11.90  | 2.07  | 0.78 | 0.02 | ccc85b4 |
| M2 Ultra | COREML METAL | small    | 1  | 32.19  | 4.10  | 1.78 | 0.05 | ccc85b4 |
| M2 Ultra | COREML METAL | medium   | 1  | 94.43  | 8.40  | 3.89 | 0.12 | ccc85b4 |
| M2 Ultra | COREML METAL | large-v2 | 1  | 179.78 | 12.12 | 6.07 | 0.22 | ccc85b4 |

| Hw          | Config    | Model         | Th | Enc.   | Dec.  | Bch5 | PP   | Commit  |
|-------------|-----------|---------------|----|--------|-------|------|------|---------|
| NVIDIA V100 | BLAS CUDA | tiny          | 1  | 8.84   | 1.62  | 0.33 | 0.02 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | tiny-q5_0     | 1  | 8.43   | 1.19  | 0.31 | 0.02 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | tiny-q5_1     | 1  | 8.41   | 1.19  | 0.29 | 0.02 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | base          | 1  | 14.79  | 2.31  | 0.46 | 0.03 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | base-q5_0     | 1  | 15.05  | 1.66  | 0.44 | 0.03 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | base-q5_1     | 1  | 15.01  | 1.68  | 0.46 | 0.03 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | small         | 1  | 40.30  | 4.37  | 0.88 | 0.05 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | small-q5_0    | 1  | 41.17  | 3.11  | 0.94 | 0.05 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | small-q5_1    | 1  | 41.12  | 3.11  | 0.82 | 0.05 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | medium        | 1  | 104.93 | 10.06 | 1.77 | 0.11 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | medium-q5_0   | 1  | 107.11 | 6.13  | 2.07 | 0.12 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | medium-q5_1   | 1  | 107.91 | 6.21  | 1.77 | 0.12 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | medium-dis    | 1  | 103.45 | 1.11  | 0.24 | 0.02 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | large-v2      | 1  | 171.55 | 15.76 | 2.62 | 0.17 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | large-v2-q5_0 | 1  | 176.27 | 8.61  | 3.17 | 0.19 | ccc85b4 |
| NVIDIA V100 | BLAS CUDA | large-v2-q5_1 | 1  | 176.23 | 8.67  | 2.59 | 0.19 | ccc85b4 |

| Hw                | Config | Model         | Th | Enc.    | Dec.  | Bch5  | PP   | Commit  |
|-------------------|--------|---------------|----|---------|-------|-------|------|---------|
| AMD Ryzen 9 5950X | AVX2   | tiny          | 8  | 197.47  | 1.22  | 0.44  | 0.25 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | tiny-q5_0     | 8  | 222.92  | 0.87  | 0.45  | 0.30 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | tiny-q5_1     | 8  | 221.25  | 0.89  | 0.45  | 0.30 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | base          | 8  | 427.14  | 3.11  | 0.88  | 0.43 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | base-q5_0     | 8  | 474.96  | 1.41  | 0.72  | 0.51 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | base-q5_1     | 8  | 485.05  | 1.48  | 0.73  | 0.52 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | small         | 8  | 1470.51 | 11.70 | 2.89  | 1.21 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | small-q5_0    | 8  | 1700.43 | 5.48  | 1.98  | 1.41 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | small-q5_1    | 8  | 1719.03 | 5.79  | 2.02  | 1.42 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | medium        | 8  | 4417.70 | 35.13 | 8.14  | 3.24 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | medium-q5_0   | 8  | 5335.77 | 17.44 | 5.35  | 3.92 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | medium-q5_1   | 8  | 5372.26 | 18.36 | 5.42  | 3.88 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | medium-dis    | 8  | 4070.25 | 4.86  | 1.16  | 0.53 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | large-v2      | 8  | 8179.09 | 66.89 | 15.45 | 5.88 | ccc85b4 |
| AMD Ryzen 9 5950X | AVX2   | large-v2-dis  | 8  | 7490.45 | 7.06  | 1.63  | 0.70 | ccc85b4 |

API Changes

  • Add struct whisper_context_params

  • Add whisper_log_set

  • Deprecate:

    • whisper_init_from_file
    • whisper_init_from_buffer
    • whisper_init
    • whisper_init_from_file_no_state
    • whisper_init_from_buffer_no_state
    • whisper_init_no_state
  • Add:

    • whisper_init_from_file_with_params
    • whisper_init_from_buffer_with_params
    • whisper_init_with_params
    • whisper_init_from_file_with_params_no_state
    • whisper_init_from_buffer_with_params_no_state
    • whisper_init_with_params_no_state
  • Diff of struct whisper_full_params

     struct whisper_full_params {
         enum whisper_sampling_strategy strategy;
@@ -338,6 +435,7 @@ extern "C" {
 
         bool translate;
         bool no_context;        // do not use past transcription (if any) as initial prompt for the decoder
+        bool no_timestamps;     // do not generate timestamps
         bool single_segment;    // force single segment output (useful for streaming)
         bool print_special;     // print special tokens (e.g. <SOT>, <EOT>, <BEG>, etc.)
         bool print_progress;    // print progress information
@@ -355,8 +453,12 @@ extern "C" {
         // [EXPERIMENTAL] speed-up techniques
         // note: these can significantly reduce the quality of the output
         bool speed_up;          // speed-up the audio by 2x using Phase Vocoder
+        bool debug_mode;        // enable debug_mode provides extra info (eg. Dump log_mel)
         int  audio_ctx;         // overwrite the audio context size (0 = use default)
 
+        // [EXPERIMENTAL] [TDRZ] tinydiarize
+        bool tdrz_enable;       // enable tinydiarize speaker turn detection
+
         // tokens to provide to the whisper decoder as initial prompt
         // these are prepended to any existing text context from a previous call
         const char * initial_prompt;
@@ -365,6 +467,7 @@ extern "C" {
 
         // for auto-detection, set to nullptr, "" or "auto"
         const char * language;
+        bool detect_language;
 
         // common decoding parameters:
         bool suppress_blank;    // ref: https://github.com/openai/whisper/blob/f82bc59f5ea234d4b97fb2860842ed38519f7e65/whisper/decoding.py#L89
@@ -403,11 +506,24 @@ extern "C" {
         whisper_encoder_begin_callback encoder_begin_callback;
         void * encoder_begin_callback_user_data;
 
+        // called each time before ggml computation starts
+        whisper_abort_callback abort_callback;
+        void * abort_callback_user_data;
+
         // called by each decoder to filter obtained logits
         whisper_logits_filter_callback logits_filter_callback;
         void * logits_filter_callback_user_data;
+
+        const whisper_grammar_element ** grammar_rules;
+        size_t                           n_grammar_rules;
+        size_t                           i_start_rule;
+        float                            grammar_penalty;
     };
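
To illustrate the init changes, a minimal migration sketch - whisper_context_params carries a use_gpu flag in this release:

    #include <whisper.h>

    int main(void) {
        // before (now deprecated):
        //   struct whisper_context * ctx = whisper_init_from_file("models/ggml-base.en.bin");

        // after:
        struct whisper_context_params cparams = whisper_context_default_params();
        cparams.use_gpu = true; // allow Metal/CUDA offloading

        struct whisper_context * ctx =
            whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
        if (!ctx) return 1;

        // the new whisper_log_set can redirect all library logging:
        // whisper_log_set(my_log_callback, NULL); // my_log_callback is hypothetical

        whisper_free(ctx);
        return 0;
    }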
 

There might be some instability around the API, especially with the existing language bindings. I wasn't able to test everything, so expect some issues and feel free to submit PRs with any fixes that you find.

Highlights and what's next

A lot of the updates in this release are possible thanks to the many contributions to llama.cpp - a huge shoutout to all the contributors and collaborators there!

Regarding future updates to whisper.cpp, I'm looking forward to the following things:

  • Add server example similar to the one in llama.cpp
  • Try to improve Metal's batched decoding performance
  • Look for some interesting applications of the grammar sampling functionality

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.4.0...v1.5.0

v1.4.3

5 months ago

This is a minor release. The main reason for it is that there hasn't been an official release for a few months, and some small things have accumulated on the master branch that would be nice to upstream. I am planning a major v1.5.0 release with some new and long-awaited functionality soon:

  • Full CUDA offloading
  • Efficient Beam-Search implementation
  • Grammar support

The current version v1.4.3 should be considered in beta, as I haven't worked intensively on whisper.cpp recently and some issues might have made their way into the code. I'll try to polish things in the next few days and prepare a stable v1.5.0 release. In the meantime, any feedback will be highly appreciated.

Detailed API changes, features and new contributor recognitions will be included in the v1.5.0 release.

v1.4.0

11 months ago

Overview

This is a new major release adding integer quantization and partial GPU (NVIDIA) support.

Integer quantization

This allows the ggml Whisper models to be converted from the default 16-bit floating-point weights to 4-, 5- or 8-bit integer weights. The resulting quantized models are smaller on disk, use less memory and can be processed faster on some architectures. The transcription quality is degraded to some extent - this has not been quantified at the moment.

Here is a quantitative evaluation of the different quantization modes applied to the LLaMA and RWKV large language models. These results can give an impression of the expected quality, size and speed of quantized Whisper models:

LLaMA quantization (measured on M1 Pro)

| Model | Measure      | F16    | Q4_0   | Q4_1   | Q4_2   | Q5_0   | Q5_1   | Q8_0   |
|-------|--------------|--------|--------|--------|--------|--------|--------|--------|
| 7B    | perplexity   | 5.9565 | 6.2103 | 6.1286 | 6.1698 | 6.0139 | 5.9934 | 5.9571 |
| 7B    | file size    | 13.0G  | 4.0G   | 4.8G   | 4.0G   | 4.4G   | 4.8G   | 7.1G   |
| 7B    | ms/tok @ 4th | 128    | 56     | 61     | 84     | 91     | 95     | 75     |
| 7B    | ms/tok @ 8th | 128    | 47     | 55     | 48     | 53     | 59     | 75     |
| 7B    | bits/weight  | 16.0   | 5.0    | 6.0    | 5.0    | 5.5    | 6.0    | 9.0    |
| 13B   | perplexity   | 5.2455 | 5.3748 | 5.3471 | 5.3433 | 5.2768 | 5.2582 | 5.2458 |
| 13B   | file size    | 25.0G  | 7.6G   | 9.1G   | 7.6G   | 8.4G   | 9.1G   | 14G    |
| 13B   | ms/tok @ 4th | 239    | 104    | 113    | 160    | 176    | 185    | 141    |
| 13B   | ms/tok @ 8th | 240    | 85     | 99     | 97     | 108    | 117    | 147    |
| 13B   | bits/weight  | 16.0   | 5.0    | 6.0    | 5.0    | 5.5    | 6.0    | 9.0    |

ref: https://github.com/ggerganov/llama.cpp#quantization

RWKV quantization

| Format | Perplexity (169M) | Latency, ms (1.5B) | File size, GB (1.5B) |
|--------|-------------------|--------------------|----------------------|
| Q4_0   | 17.507            | 76                 | 1.53                 |
| Q4_1   | 17.187            | 72                 | 1.68                 |
| Q4_2   | 17.060            | 85                 | 1.53                 |
| Q5_0   | 16.194            | 78                 | 1.60                 |
| Q5_1   | 15.851            | 81                 | 1.68                 |
| Q8_0   | 15.652            | 89                 | 2.13                 |
| FP16   | 15.623            | 117                | 2.82                 |
| FP32   | 15.623            | 198                | 5.64                 |

ref: https://github.com/ggerganov/ggml/issues/89#issuecomment-1528781992

This feature is possible thanks to the many contributions in the llama.cpp project: https://github.com/users/ggerganov/projects/2

GPU support via cuBLAS

Using cuBLAS mainly improves the Encoder inference speed. I haven't done proper timings, but one can expect at least 2-3 times faster Encoder evaluation with modern NVIDIA GPUs compared to CPU-only processing. Feel free to post your Encoder benchmarks in issue #89.

This is another feature made possible by the llama.cpp project. Special recognition to @slaren for putting almost all of this work together.


This release remains in "beta" stage as I haven't verified that everything works as expected.

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.3.0...v1.4.0

v1.3.0

1 year ago

Overview

This release should be considered in beta stage, since I haven't done a lot of testing and I am not sure whether something got broken along the way. But overall, I believe both the performance and the quality are improved.

  • Added Core ML support (#566) - a build sketch follows this list
  • Restored decoding fallbacks, with a default size of 2 instead of 5 (f19e23fbd108ec3ac458c7a19b31c930719e7a94)
  • Pad the audio with zeros instead of padding the spectrogram (5108b30e6daf361c856abb6b86e5038500bdbeb1)
  • Added the talk-llama example
  • Added whisper_state, which allows parallel transcriptions with a single model in memory (#523)
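
A sketch of the Core ML workflow, following the steps documented in the README (requires Python with the coremltools package):

# generate a Core ML model of the encoder for base.en
./models/generate-coreml-model.sh base.en

# rebuild whisper.cpp with Core ML support
make clean
WHISPER_COREML=1 make -j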

The C-style API has been extended significantly to support the new whisper_state, but in general it should be backwards compatible. The only breaking change is in the callback signatures.
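
A minimal sketch of the new state API, assuming the function names from #523 - each state can be driven from its own thread while sharing one model:

    #include <whisper.h>

    int main(void) {
        struct whisper_context * ctx =
            whisper_init_from_file_no_state("models/ggml-base.en.bin");

        // two independent states sharing the same model weights
        struct whisper_state * st1 = whisper_init_state(ctx);
        struct whisper_state * st2 = whisper_init_state(ctx);

        struct whisper_full_params wparams =
            whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

        // e.g. from two threads, with separate PCM buffers:
        // whisper_full_with_state(ctx, st1, wparams, pcm_a, n_a);
        // whisper_full_with_state(ctx, st2, wparams, pcm_b, n_b);

        whisper_free_state(st1);
        whisper_free_state(st2);
        whisper_free(ctx);
        return 0;
    }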

Please provide feedback in the discussion if you observe any issues.

The next release v1.4.0 will follow up relatively soon and will provide 4-bit integer quantization support.

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.2.1...v1.3.0

v1.2.1

1 year ago

Overview

This is a minor release. The main reason for it is a fix for a critical bug that causes the software to crash randomly when the language auto-detect option (i.e. whisper_lang_auto_detect()) is used.

Other than that, the release includes refactoring of the examples, Ruby bindings and some minor changes to the C API.

You can provide feedback in the existing v1.2.0 discussion.

What's Changed

C-style API

  • Add whisper_pcm_to_mel_phase_vocoder()
  • Add whisper_logits_filter_callback
  • Change struct whisper_full_params
  • Add whisper_full_lang_id()
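
For example, after running whisper_full() with language auto-detection, the detected language can be read back. A small fragment (assumes ctx is an existing context and stdio.h is included):

    // assumes params.language was set to "auto" for the whisper_full() call
    const int lang_id = whisper_full_lang_id(ctx);
    printf("detected language: %s\n", whisper_lang_str(lang_id));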

Full Changelog: https://github.com/ggerganov/whisper.cpp/compare/v1.2.0...v1.2.1

Highlights

Recently, I have been making progress on adding integer quantization support to the ggml tensor library. This will eventually allow the use of quantized models, which require less memory and will hopefully run faster. I think the next major release, v1.3.0, will officially add quantization support. For now, you can keep track of the progress in #540