xxHash Versions

Extremely fast non-cryptographic hash algorithm

v0.8.2

8 months ago

xxHash v0.8.2 is an incremental update featuring multiple small improvements and fixes spread out over ~300 commits.

Faster performance

Several updates by @easyaspi314 and @hzhuang1 improve performance on Arm platforms, most notably the NEON code path. On the M1 Pro, this translates into +20% speed for xxh3 and xxh128 (from 30.0 GB/s to 36 GB/s). Some of the changes are generic, so other platforms can benefit too, though typically to a lesser extent (~5%).

On wasm, xxh3 speed improves by a factor of 2 to 3 (depending on the underlying hardware) through the use of simd128 (@easyaspi314). This is especially effective under the V8 JavaScript engine, notably used by Chrome and Node.js.

Finally, @hzhuang1 added support for Arm's SVE vector extension. This is useful for server-side aarch64 CPUs with hardware support for wide vectors, such as Fujitsu's A64FX.

Fixes and improvements

Notable fixes in this update include the resolution of issues with XXH3 S390x vector implementation, PowerPC vector compilation with IBM XL compiler, and -Og compilation.

Furthermore, the command line interface (CLI) was refined with support for comment lines in check files and for the --binary and --ignore-missing commands (@t-mat). Additionally, issues with filenames containing a /LF character were resolved.

The build process was also refined, with improvements such as fixing pkgconfig generation with cmake (@ilya-fedin), icc compilation, cmake install directories, and new build options to reduce binary size (@easyaspi314). Dedicated install targets were introduced (@ffontaine), and support for DISPATCH mode in cmake was added (@hzhuang1).

In terms of portability, the update includes the SVE vector implementation of XXH3, compatibility with freestanding environments using XXH_NO_STDLIB, and the ability to build on Haiku. The code has also been validated on m68k and risc-v.

Documentation

XXH3 finally has a written specification, thanks to @Adrien1018! The source code can also be processed by doxygen to generate code documentation automatically. An instance is now available on the homepage.

Changelog

  • fix : XXH3 S390x vector implementation (@hzhuang1)
  • fix : PowerPC vector compilation with IBM XL compiler (@MaxiBoether)
  • perf : improved WASM speed by x2/x3 using SIMD128 (@easyaspi314)
  • perf : improved speed (+20%) for XXH3 on ARM NEON (@easyaspi314)
  • cli : fix filenames containing /LF character (@t-mat)
  • cli : Support # comment lines in --check files (@t-mat)
  • cli : Support commands --binary and --ignore-missing (@t-mat)
  • build: fix -Og compilation (@easyaspi314, @t-mat)
  • build: fix pkgconfig generation with cmake (@ilya-fedin)
  • build: fix icc compilation
  • build: fix cmake install directories
  • build: new build options XXH_NO_XXH3, XXH_SIZE_OPT and XXH_NO_STREAM to reduce binary size (@easyaspi314)
  • build: dedicated install targets (@ffontaine)
  • build: support DISPATCH mode in cmake (@hzhuang1)
  • portability: fix x86dispatch when building with Visual + clang-cl (@t-mat)
  • portability: SVE vector implementation of XXH3 (@hzhuang1)
  • portability: compatibility with freestanding environments, using XXH_NO_STDLIB
  • portability: can build on Haiku (@Begasus)
  • portability: validated on m68k and risc-v
  • doc : XXH3 specification (@Adrien1018)
  • doc : improved doxygen documentation (@easyaspi314, @t-mat)
  • misc : dedicated sanity test binary (@t-mat)

Full Changelog: https://github.com/Cyan4973/xxHash/compare/v0.8.1...v0.8.2

v0.8.1

2 years ago

xxHash v0.8.1 is a general cleanup of the code base, following the stabilization of xxh3 and xxh128 in v0.8.0. There are a few welcome evolutions and improvements, but for the most part this release consists of fixes for multiple corner cases and scenarios, which should improve the usability of libxxhash and xxhsum across a wide range of platforms. Stable API entry points have not changed: all entry points labelled "stable" will continue to work as intended in this release and future ones.

Improved performance

While the "big picture" is unchanged, there are a few notable improvements. XXH3 / XXH128 feature a large speed improvement in streaming mode, by as much as +40%. The gain is particularly noticeable with gcc and MSVC (clang was already in good shape), and it makes streaming speed essentially on par with single-shot mode when ingesting large quantities of data.

XXH64 and even XXH32 feature improved latency performance for small inputs of random sizes. Perhaps as importantly, their binary size is smaller.

New capabilities

There is a new experimental XXH3 variant, named _withSecretandSeed(). In a nutshell, it uses the seed for small inputs and the secret for large inputs. The main driver for this variant is a wish to skip the delay caused by the secret's transparent generation when using the _withSeed() variant with large inputs, which results in a measurable performance drop for "not so large" sizes (< 1 KB) (note: this delay is negligible for "large" inputs, such as > 256 KB). Coupled with the new function XXH3_generateSecret_fromSeed(), which generates the same secret as the one generated internally by the _withSeed() variant, it produces exactly the same return values while skipping the secret generation stage, thus improving speed.

Experimental XXH3_generateSecret() has been extended to allow generation of a secret of any size (though respecting the specification's minimum size). It's generally recommended to use this generator to ensure a source of "high entropy" for the secret.

On the CLI front, a much-requested xxhsum feature was the ability to generate XXH3 checksum values. This is achieved in v0.8.1, using the --tag format, which ensures that XXH3 results cannot be confused with (default) XXH64 ones, even though they feature the same 64-bit width.

Detailed changelist

  • perf : much improved performance for XXH3 streaming variants, notably on gcc and msvc
  • perf : improved XXH64 speed and latency on small inputs
  • perf : small XXH32 speed and latency improvement on small inputs of random size
  • perf : minor stack usage improvement for XXH32 and XXH64
  • api : new experimental variants XXH3_*_withSecretandSeed()
  • api : updated XXH3_generateSecret(), can now generate secret of any size (>= XXH3_SECRET_SIZE_MIN)
  • cli : xxhsum can now generate and check XXH3 checksums, using command -H3
  • build: can build xxhash without XXH3, with new build macro XXH_NO_XXH3
  • build: fix xxh_x86dispatch build with MSVC, by @apankrat
  • build: XXH_INLINE_ALL can always be used safely, even after XXH_NAMESPACE or a previous XXH_INLINE_ALL
  • build: improved PPC64LE vector support, by @mpe
  • install: fix pkgconfig, by @ellert
  • install: compatibility with Haiku, by @Begasus
  • doc : code comments made compatible with doxygen, by @easyaspi314
  • misc : XXH_ACCEPT_NULL_INPUT_POINTER is no longer necessary, all functions can accept NULL input pointers, as long as size == 0
  • misc : complete refactor of CI tests on Github Actions, offering much larger coverage, by @t-mat
  • misc : xxhsum code base split into multiple specialized units, within directory cli/, by @easyaspi314

v0.8.0

3 years ago

Stable XXH3

After more than a year in the making, XXH3 has finally reached stable status, for both its 64-bit and 128-bit variants. While the code itself was in good enough shape for production use, the generated values could still change between versions. This limited XXH3 to local sessions only. From now on, output values produced by XXH3 for a given input and parameter set will remain identical across systems and across future versions. It makes it possible to store these values for later comparison, or to exchange them across network connections.

BSD-style checksums

Official stabilization being the main goal of this release, there are only minimal additional changes. A notable one though is the ability for xxhsum CLI to produce and check BSD-style checksum lines, using command --tag. One advantage of --tag format is that it explicitly specifies the algorithm and format used to represent the checksum. For example, it explicitly mentions if a checksum value follows the canonical format (XXH32) or the alternative little-endian format (XXH32_LE). Generating BSD-style checksum lines was actually already possible, but as the CLI was unable to --check them, it remained a hidden option. This situation changes with v0.8.0, thanks to a patch by @WayneD which makes it possible to --check BSD-style checksum lines.
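To make the canonical versus little-endian distinction concrete, here is a minimal sketch of how a 32-bit hash value maps to bytes in each representation. It assumes the canonical form is big-endian (most significant byte first). The helper names are hypothetical and are not part of the xxHash API or of xxhsum.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative helpers only; these names are hypothetical, not xxHash API. */

/* Canonical form: most significant byte first (big-endian). */
static void hash32_to_canonical(uint32_t h, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(h >> 24);
    out[1] = (unsigned char)(h >> 16);
    out[2] = (unsigned char)(h >> 8);
    out[3] = (unsigned char)h;
}

/* Alternative form: least significant byte first (little-endian). */
static void hash32_to_little_endian(uint32_t h, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)h;
    out[1] = (unsigned char)(h >> 8);
    out[2] = (unsigned char)(h >> 16);
    out[3] = (unsigned char)(h >> 24);
}

/* Writes 8 lowercase hex digits plus a terminating NUL into dst. */
static void bytes4_to_hex(const unsigned char in[4], char dst[9])
{
    assert(in != NULL && dst != NULL);
    snprintf(dst, 9, "%02x%02x%02x%02x", in[0], in[1], in[2], in[3]);
}
```

For the value 0x01020304, the canonical bytes print as 01020304 while the little-endian bytes print as 04030201, which is why a checksum line needs to state which of the two it carries.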

Detailed list

  • api : stabilize XXH3
  • cli : xxhsum can produce BSD-style lines, with command --tag
  • cli : xxhsum can parse and check BSD-style lines, using command --check, by @WayneD
  • cli : xxhsum - accepts console input, requested by @jaki
  • cli : xxhsum accepts -- separator, by @jaki
  • cli : fix : print correct default algo for symlinked helpers, by @martinetd
  • install: improved pkgconfig script, allowing custom install locations, requested by @ellert
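
As an illustration of what checking a BSD-style line involves, the sketch below parses one line of the assumed form "ALGO (filename) = hexdigest" into its three fields. This is not xxhsum's actual parser: the TagLine type, the function name, and the buffer sizes are all hypothetical choices made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for this sketch; not an xxhsum type. */
typedef struct {
    char algo[16];   /* e.g. "XXH64" or "XXH32_LE" */
    char file[256];  /* file name found between the parentheses */
    char hex[65];    /* hexadecimal digest */
} TagLine;

/* Parses one "ALGO (file) = hexdigest" line.
 * Returns 1 and fills *out on success, 0 if the line does not match. */
static int parse_tag_line(const char* line, TagLine* out)
{
    static const char sep[] = ") = ";
    const char* open;
    const char* close = NULL;
    const char* p;
    const char* hex;
    size_t algoLen, fileLen, hexLen, i;

    assert(line != NULL && out != NULL);

    open = strstr(line, " (");
    if (open == NULL) return 0;

    /* Keep the last ") = " so that file names containing ')' still parse. */
    for (p = open + 2; (p = strstr(p, sep)) != NULL; p++) close = p;
    if (close == NULL) return 0;

    algoLen = (size_t)(open - line);
    fileLen = (size_t)(close - (open + 2));
    hex     = close + (sizeof(sep) - 1);
    hexLen  = strcspn(hex, "\r\n");

    if (algoLen == 0 || algoLen >= sizeof(out->algo)) return 0;
    if (fileLen == 0 || fileLen >= sizeof(out->file)) return 0;
    if (hexLen  == 0 || hexLen  >= sizeof(out->hex))  return 0;

    /* Algorithm tags are restricted here to letters, digits and '_'. */
    for (i = 0; i < algoLen; i++) {
        if (!isalnum((unsigned char)line[i]) && line[i] != '_') return 0;
    }
    for (i = 0; i < hexLen; i++) {
        if (!isxdigit((unsigned char)hex[i])) return 0;
    }

    memcpy(out->algo, line, algoLen);     out->algo[algoLen] = '\0';
    memcpy(out->file, open + 2, fileLen); out->file[fileLen] = '\0';
    memcpy(out->hex, hex, hexLen);        out->hex[hexLen] = '\0';
    return 1;
}
```

Because the algorithm tag is part of each line, a checker built this way can pick the right hash function per line instead of guessing from the digest width.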

v0.7.4

3 years ago

xxHash v0.7.4 is the last evolution of xxh3 and xxh128, primarily designed to finalize the algorithm. It is considered a release candidate for v0.8.0, which means that if all goes right, this version will be rebranded v0.8.0, almost "as is", within the next few weeks, after receiving sufficient feedback. v0.8.0 is the official version after which XXH3 and XXH128 are considered "stabilized", meaning that return values will never change given the same input and seed, making the hash suitable for long-term storage and transmission.

Beyond these "final touches", the new version also brings a few notable improvements.

Automatic vector detection

x86/x64 systems can enjoy a new unit, xxh_x86dispatch, which can detect at runtime the best vector instruction set present on the host system (none, sse2, avx2 or avx512), thanks to a CPU feature detector designed by @easyaspi314. It then automatically runs the appropriate vector code. This makes it safer to deploy a single binary with advanced vector instruction sets, such as AVX2, since there is no hard requirement for all target systems to actually support it: the binary can automatically switch to SSE2 instead. As a proof of concept, the Windows builds provided alongside this release are compiled with this new capability.
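
The general pattern behind runtime dispatch can be sketched as follows. This is not the xxh_x86dispatch implementation: it assumes a GCC- or clang-compatible compiler and relies on the __builtin_cpu_supports() builtin instead of a custom CPU feature detector, and every type and function name in it is hypothetical. The kernels are placeholders that all share one portable body (a byte sum, not a hash) so the sketch compiles on any target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel signature; a placeholder for a real hashing loop. */
typedef uint64_t (*kernel_fn)(const unsigned char* data, size_t len);

typedef enum { ISA_NONE, ISA_SSE2, ISA_AVX2, ISA_AVX512 } isa_level;

/* Placeholder kernel: a plain byte sum, NOT a hash. */
static uint64_t kernel_scalar(const unsigned char* data, size_t len)
{
    uint64_t acc = 0;
    size_t i;
    for (i = 0; i < len; i++) acc += data[i];
    return acc;
}

/* Stand-ins for vectorized kernels. Real code would use intrinsics and
 * compile each variant with the matching target flags. */
static uint64_t kernel_sse2(const unsigned char* d, size_t n)   { return kernel_scalar(d, n); }
static uint64_t kernel_avx2(const unsigned char* d, size_t n)   { return kernel_scalar(d, n); }
static uint64_t kernel_avx512(const unsigned char* d, size_t n) { return kernel_scalar(d, n); }

/* Returns the best instruction set reported by the host CPU.
 * Falls back to ISA_NONE on compilers or targets without the builtin. */
static isa_level detect_isa(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return ISA_AVX512;
    if (__builtin_cpu_supports("avx2"))    return ISA_AVX2;
    if (__builtin_cpu_supports("sse2"))    return ISA_SSE2;
#endif
    return ISA_NONE;
}

static kernel_fn select_kernel(isa_level level)
{
    switch (level) {
    case ISA_AVX512: return kernel_avx512;
    case ISA_AVX2:   return kernel_avx2;
    case ISA_SSE2:   return kernel_sse2;
    default:         return kernel_scalar;
    }
}

/* Detects once, caches the choice, then calls through the pointer.
 * The cache is not thread-safe; that is acceptable for a sketch. */
static uint64_t run_dispatched(const unsigned char* data, size_t len)
{
    static kernel_fn chosen = NULL;
    assert(data != NULL || len == 0);
    if (chosen == NULL) chosen = select_kernel(detect_isa());
    return chosen(data, len);
}
```

The key property is that the binary contains every variant but only executes the one the host supports, so a build that includes AVX2 code still runs on an SSE2-only machine.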

AVX512 support

A new vector instruction set is supported, thanks to @gzm55: AVX512. It can be applied to XXH3 and XXH128 on some of the most recent Intel CPUs, such as Ice Lake on laptops. It typically offers about 50% more performance than AVX2.

Secret Generator

Advanced users may be interested in the highly customizable variant _withSecret(), which makes it possible to run the XXH3 and XXH128 algorithms with one's own secret. However, the quality of the hash depends on the high entropy (randomness) of the secret, and it can sometimes be difficult to ensure that a candidate secret is "random enough". In order to produce a secret of high quality, a new function XXH3_generateSecret() is proposed in the advanced API section. It will convert any blob of bytes, named customSeed, into a high quality secret which respects all conditions expected by XXH3 and XXH128. This is true even if customSeed itself is of poor quality, such as a bunch of \0 bytes or some short or repeated common sequence.

No API modification

The existing API present in 0.7.3 remains unchanged in 0.7.4. Any program linking with 0.7.3 should continue to work as-is. Note however that xxh3/xxh128 return values are not comparable across these versions. 0.7.x releases are labelled development versions, and should only be used for ephemeral data (hashes produced and consumed in the same local session). (Note: this limitation does not extend to XXH32 and XXH64, which are considered fully stable and specified.)

Changelist

There are multiple smaller bug fixes and minor improvements that have been brought to this repository by great contributors. Here is a summarized list:

  • perf: automatic vector detection and selection at runtime (xxh_x86dispatch.h), initiated by @easyaspi314
  • perf: added AVX512 support, by @gzm55
  • api : new: secret generator XXH3_generateSecret(), suggested by @koraa
  • api : fix: XXH3_state_t is movable, identified by @koraa
  • api : fix: state is correctly aligned in AVX mode (unlike malloc()), by @easyaspi314
  • api : fix: streaming generated wrong values in some combination of random ingestion lengths, reported by @WayneD
  • cli : fix unicode print on Windows, by @easyaspi314
  • cli : can -c check file generated by sfv
  • build: make DISPATCH=1 generates xxhsum and libxxhash with runtime vector detection (x86/x64 only)
  • install: cygwin installation support
  • doc : Cryptol specification of XXH32 and XXH64, by @weaversa

v0.7.3

4 years ago

xxHash v0.7.3 is a major evolution for xxh3 and xxh128, with a focus on speed and dispersion performance.

Speed improvements

v0.7.3 pays a lot of attention to small data, delivering generally better latency (an improvement of about 10%).

Inlining is now a first class citizen, as it is generally key to best performance on small inputs. Among the visible changes:

  • XXH_INLINE_ALL can always be set before including xxhash.h, even if xxhash.h was previously included (for example transitively, as part of a prior *.h header file).
  • The algorithm implementation has been transferred into xxhash.h. It's no longer necessary to keep a copy of xxhash.c in the /include directory for inlining to work correctly.
    • Note: xxhash.c still exists, as it's useful to instantiate xxhash functions as public symbols accessible from a library or a *.o object file. It also remains compatible with existing projects.

Large data has also received a boost, which can go up to +20% for very large samples (> many MB).

Let's underline the remarkable optimization work of @easyaspi314, who hand optimized several hot loops and instructions, and even added a new Z-vector target for s390x hardware.

No API modification

The API has remained completely stable between 0.7.2 and 0.7.3. Any programs linking with 0.7.2 should work as-is. Note that xxh3/xxh128 results are not comparable across these versions.

New test tool

Testing a 64-bit hash algorithm for its collision rate has remained elusive for most. The sheer volume of data required to assess quality at this scale is too large for traditional test tools like SMHasher. As a general guide, it takes about 5 billion hashes (roughly 1.18 × 2^32) to reach a 50% probability of at least one collision. Accurate collision ratio evaluation requires many more hashes to measure something meaningful.

A new open-source tool in tests/collisions offers this capability. It requires a lot of memory to run, with a minimum of 32 GB to measure anything significant. But provided that one has a system with enough capacity, it can accurately measure the collision ratio of any 64-bit hash algorithm.
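
The volume argument above follows from the birthday bound. As a rough sketch, assuming an ideal hash that spreads values uniformly, the expected number of colliding pairs among n hashes of a given bit width is n × (n − 1) / 2 divided by 2^bits. The function below is a hypothetical helper written for this note and is not part of tests/collisions.

```c
#include <assert.h>

/* Expected number of colliding pairs when drawing n values uniformly from
 * a space of 2^bits values: n * (n - 1) / 2 / 2^bits.
 * Assumes an ideal hash; a real measurement is compared against this. */
static double expected_collisions(double n, unsigned bits)
{
    double space = 1.0;
    unsigned i;
    assert(n >= 0.0);
    assert(bits <= 1023);                      /* keep 2^bits finite in double */
    for (i = 0; i < bits; i++) space *= 2.0;   /* powers of two are exact */
    return n * (n - 1.0) / 2.0 / space;
}
```

With n = 2^32 (about 4.3 billion) and 64 bits, the expected count is just under 0.5, which under a Poisson approximation corresponds to roughly a 39% chance of seeing at least one collision; the 50% point sits near 5.1 billion hashes. With n = 100 × 2^30 hashes, the expected count is about 312.5, which is the scale at which a measured collision count starts to be statistically meaningful.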

Several algorithms were measured thanks to this tool, the result of which is currently consolidated on this wiki page. More can be added in the future.

This new development round also introduced several improvements to the SMHasher test suite, uncovering new requirements for new scenarios. This proved beneficial to improve the general dispersion qualities of xxh3 and xxh128.

Changelist

Here is a summarized list of changes for this version:

  • perf: improved speed for large inputs (~+20%)
  • perf: improved latency for small inputs (~10%)
  • perf: s390x Vectorial code, by @easyaspi314
  • cli: Improved support for Unicode filenames on Windows, thanks to @easyaspi314 and @t-mat
  • api: xxhash.h can now be included in any order, multiple times, with and without XXH_STATIC_LINKING_ONLY or XXH_INLINE_ALL
  • build: xxHash's implementation has been transferred into xxhash.h. There is no more need to have xxhash.c in the /include directory for XXH_INLINE_ALL to work
  • install: created pkg-config file, by @bket
  • install: VCpkg installation instructions, by @LilyWangL
  • doc: Highly improved code documentation, by @easyaspi314
  • misc: New test tool in /tests/collisions: brute force collision tester for 64-bit hashes

v0.7.2

4 years ago

This is a maintenance release, focused on the newer 128-bit variant. Note that XXH3 is still labelled experimental: return values from this version are not comparable with other versions.

  • Fixed collision ratio of XXH128 for some specific input lengths, reported by @svpv
  • Improved VSX and NEON variants, by @easyaspi314
  • Improved performance of scalar code path (XXH_VECTOR=0), by @easyaspi314
  • xxhsum : can generate 128-bit hash with command -H2 (note : for experimental purposes only ! XXH128 is not yet frozen)
  • xxhsum : option -q removes status notifications

v0.7.1

4 years ago

The main feature of this release is an update of XXH3, building upon extensive user feedback gathered during this test period. The main points are:

  • Secret first : the algorithm computation can be altered by providing a "secret", which is any blob of bytes, of size >= XXH3_SECRET_SIZE_MIN.
  • seed is still available, and acts as a secret generator
  • As a consequence of these changes, note that new return values of XXH3 are not compatible with v0.7.0
  • updated ARM NEON variant by @easyaspi314
  • Streaming implementation is available
  • Improve compatibility and performance with Visual Studio, with help from @aras-p
  • Better integration when using XXH_INLINE_ALL : do not pollute host namespace, use its own macros, such as XXH_ASSERT(), XXH_ALIGN, etc.
  • The 128-bit variant provides a helper function for comparison of hashes.

Note that XXH3 is still considered experimental at this stage. It will have to remain stable for at least 2 releases before being branded "stable". After that stage, the algorithm and produced results will no longer evolve.

Several general improvements are also present in this release :

  • Better clang generation of rotl instruction, thanks to @easyaspi314
  • XXH_REROLL build macro, to reduce binary size, by @easyaspi314
  • Improved cmake script, by @Mezozoysky
  • Full benchmark program provided in /tests/bench
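
For context on the rotl item above, the usual portable way to express a 32-bit left rotation in C is a pair of shifts combined with OR, which current compilers generally recognize and turn into a single rotate instruction. The helper below is a generic sketch with a hypothetical name; it is not copied from xxhash.h.

```c
#include <assert.h>
#include <stdint.h>

/* Rotates x left by r bits. Masking keeps both shift counts in [0, 31],
 * so r == 0 never triggers an undefined shift by 32. */
static uint32_t rotl32(uint32_t x, unsigned r)
{
    return (x << (r & 31u)) | (x >> ((32u - r) & 31u));
}
```

For example, rotating 0x80000001 left by one bit moves the top bit around to the bottom and yields 0x00000003.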

v0.7.0

5 years ago

The main highlight of this release is the introduction of XXH3, a new hash algorithm offering much improved speed, for both large and small inputs.

XXH3 is still labelled experimental, and must be unlocked with macro XXH_STATIC_LINKING_ONLY. The source code is located in its own xxh3.h file, which is automatically included (and therefore required) by xxhash.c. It's also possible to include xxh3.h directly, which has a similar effect to triggering XXH_INLINE_ALL. At this stage, XXH3 is suitable for ephemeral data and tests, but avoid storing long-term hash values yet. XXH3 will be transferred into stable in a future release, after a period dedicated to gathering user feedback.

For more details on XXH3 performance, see this article.

Note: there are known compilation issues under Visual Studio, which were later fixed in the dev branch.

v0.6.5

5 years ago

  • Improved performance on small keys, thanks to suggestions from Jens Bauer
  • New build macro, XXH_INLINE_ALL, extremely effective for small keys of fixed length (see this article for details)
  • XXH32() : better performance on OS-X clang by disabling auto-vectorization
  • Improved benchmark measurements accuracy on small keys
  • Included xxHash specification document

v0.6.4

6 years ago

  • build: new target make lib
  • build: make install also installs library libxxhash
  • build: cmake builds library by default