Jiwer Versions

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
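For readers unfamiliar with the measure, here is a minimal standard-library sketch of what word error rate means: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name word_error_rate is hypothetical and this is not jiwer's implementation, only an illustration of the definition.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```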

v3.0.3

8 months ago

Full Changelog: https://github.com/jitsi/jiwer/compare/v3.0.2...v3.0.3

v3.0.2

11 months ago

What's Changed

add option to skip correct pairs in visualization by @nikvaessen in https://github.com/jitsi/jiwer/pull/79

Full Changelog: https://github.com/jitsi/jiwer/compare/v3.0.1...v3.0.2

v3.0.1

1 year ago

What's Changed

fix docstring by @nikvaessen in https://github.com/jitsi/jiwer/pull/75
fix bug in deprecation of truth by @nikvaessen in https://github.com/jitsi/jiwer/pull/77

Minor release to fix #76.

Full Changelog: https://github.com/jitsi/jiwer/compare/v3.0.0...v3.0.1

v3.0.0

1 year ago

What's Changed

This release makes breaking changes to the jiwer API.

First, we introduce 3 new methods:

1. jiwer.compute_measures() is renamed to jiwer.process_words, and returns everything in a dataclass named WordOutput.
2. jiwer.cer(return_dict=True) is deprecated, and is superseded by jiwer.process_characters, which returns everything in a dataclass named CharacterOutput.
3. jiwer.visualize_measures is renamed to jiwer.visualize_alignment. Moreover, the keyword argument visualize_cer: bool = False has been removed, and the output keyword argument is now expected to be of type Union[WordOutput, CharacterOutput].

I've also decided to rename all mentions of the concept "(ground) truth" to "reference", in light of the Whisper speech-to-text model showing that future ASR models might not be trained on something like a "ground truth". Therefore, in the following methods, the keyword arguments truth and truth_transform have been renamed to reference and reference_transform:

jiwer.cer()
jiwer.mer()
jiwer.wer()
jiwer.wil()
jiwer.wip()

The alignments are now stored as a list of lists containing jiwer.AlignmentChunk dataclass objects instead of hard-to-document tuples.

Lastly, I've added jiwer.transformations.cer_contiguous for optionally calculating the CER when the number of reference and hypothesis sentences differs. I've also changed wer_standardize and wer_standardize_contiguous so that their last 3 transformations are now:

        tr.Strip(),
        tr.ReduceToSingleSentence(),
        tr.ReduceToListOfListOfWords(),

This release also introduces a documentation website. See https://jitsi.github.io/jiwer.

Full Changelog: https://github.com/jitsi/jiwer/compare/v2.6.0...v3.0.0

v2.6.0

1 year ago

What's Changed

The return dictionary of jiwer.cer() and jiwer.compute_measures() now has 3 additional keys: ops, truth, and hypothesis. See the alignment section of the README, and the docstrings of the methods, for more details.

This release also adds jiwer.visualize_measures() to visualize the alignment of all ground-truth/hypothesis pairs.

Finally, installing jiwer now also installs the jiwer command, which provides a simple CLI for interacting with jiwer.

Commit list:

Alignments and a CLI interface by @nikvaessen in https://github.com/jitsi/jiwer/pull/72

Full Changelog: https://github.com/jitsi/jiwer/compare/v2.5.2...v2.6.0

v2.5.2

1 year ago

What's Changed

move to rapidfuzz library by @nikvaessen in https://github.com/jitsi/jiwer/pull/71

Full Changelog: https://github.com/jitsi/jiwer/compare/v2.5.1...v2.5.2

v2.5.1

1 year ago

What's Changed

compute the list of punctuation characters only once. by @f4hy in https://github.com/jitsi/jiwer/pull/67

New Contributors

@f4hy made their first contribution in https://github.com/jitsi/jiwer/pull/67

Full Changelog: https://github.com/jitsi/jiwer/compare/v.2.5.0...v2.5.1

v.2.5.0

1 year ago

What's Changed

Handle non-ascii punctuation in RemovePunctuation transform by @nikvaessen in https://github.com/jitsi/jiwer/pull/63
Fix bug in RemoveSpecificWords matching on partials by @nikvaessen in https://github.com/jitsi/jiwer/pull/64
Remove deprecated keywords standardize and words_to_filter by @nikvaessen in https://github.com/jitsi/jiwer/pull/65

Full Changelog: https://github.com/jitsi/jiwer/compare/v2.4.0...v.2.5.0

v2.4.0

1 year ago

What's Changed

remove mentions of old transform SentencesToListOfWords by @nikvaessen in https://github.com/jitsi/jiwer/pull/55
drop support for python 3.6 and update to poetry 1.2.0 by @nikvaessen in https://github.com/jitsi/jiwer/pull/62
Update python-levenshtein with levenshtein by @BramVanroy in https://github.com/jitsi/jiwer/pull/61

New Contributors

@BramVanroy made their first contribution in https://github.com/jitsi/jiwer/pull/61

Full Changelog: https://github.com/jitsi/jiwer/compare/v2.3.0...v2.4.0

v2.3.0

2 years ago

What's Changed

add CER, default transformation is invariant to permutations #48 by @nikvaessen in https://github.com/jitsi/jiwer/pull/49
deprecate the words_to_filter and standardize keywords in all measure functions.

Full Changelog: https://github.com/jitsi/jiwer/compare/v2.2.1...v2.3.0