Likwid Versions

Performance monitoring and benchmarking suite

v5.3.0

5 months ago

We are happy to release version 5.3.0 of LIKWID, the tool suite for performance-oriented programmers. Thanks to all contributors, especially HPE for the AMD ROCm backend.

Changelog for 5.3.0:

  • Support for Intel SapphireRapids (Core, Uncore, RAPL)
  • Support for AMD Zen4 (Core, Uncore, RAPL)
  • Support for Apple M1
  • Support for AMD GPUs (MarkerAPI, F90 interface)
  • Support for AWS Graviton3 (ARM Neoverse V1)
  • Support for HiSilicon TSV110
  • Fix of F90 interface installation
  • Support for extended umasks in ICX and SPR
  • Units for metrics in performance groups
  • Library calls to get meta information (version, supported features, etc.)
  • Some fixes for direct access mode
  • Some fixes for X86 RDPMC detection
  • Update of internal hwloc (2.9.3) and Lua (5.4.6) version
  • New experimental sysfeatures module

Note: For Intel SapphireRapids systems with HBM, LIKWID in perf_event access mode and /sys/devices/uncore_type_14_* devices, apply the attached patch. Thanks @Julius-Plehn.

Note: There is a bug in the NvMarkerAPI. If you want to use LIKWID with the NvMarkerAPI, please apply the changes in likwid-marker.h shown here

v5.2.2

1 year ago
  • Fix pin string parsing in pinning library
  • Make SBIN path configurable in build system
  • Add PKGBUILD for ArchLinux package builds
  • Remove accessDaemon double-fork in systemd environments
  • Group updates for L2/L3 (mainly AMD Zen)
  • Fix multi-initialization in MarkerAPI
  • Add energy event scaling for Fujitsu A64FX
  • Nvmon: Use CUPTI error string to get better warning/error messages
  • Nvmon: Store events internally to re-use event strings in stopCounters
  • AccessLayer: Catch SIGCHLD to stop sending requests to accessDaemon if it was killed
  • likwid-genTopoCfg: Update writing and reading of topology file
  • Add INST_RETIRED_NOP event for Intel Icelake (desktop & server)
  • Removed some memory leaks
  • Improved checks for RDPMC availability
  • Add TOPDOWN_SLOTS for perf_event
  • Fix for systems with CPU sockets without hwthreads (A64FX FX1000)
  • Fix if HOME environment variable is not set (systemd)
  • Reader function for perf_event_paranoid in Lua to get state early
  • likwid-mpirun: Sanitize np and ppn values to avoid crashes
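
The perf_event_paranoid item above concerns the kernel setting that controls how much of perf_event unprivileged users may access. As a rough illustration only, and not LIKWID's own Lua reader, the value could be read like this in C. The function names read_int_setting and perf_event_paranoid_level are hypothetical; the only assumption about the system is the standard Linux path /proc/sys/kernel/perf_event_paranoid.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a procfs-style file.
 * Returns 0 and stores the value on success, -1 on any failure. */
int read_int_setting(const char *path, int *value)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;              /* file missing or not readable */
    }
    int ok = (fscanf(fp, "%d", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}

/* Hypothetical wrapper for the standard Linux location of the setting.
 * Higher values restrict unprivileged perf_event use more strongly. */
int perf_event_paranoid_level(int *level)
{
    return read_int_setting("/proc/sys/kernel/perf_event_paranoid", level);
}
```

Checking the value before starting a measurement lets a tool warn early instead of failing later when the first counter is opened.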

Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.

v5.2.1

2 years ago

We are happy to release a new bugfix version of the LIKWID tool suite.

  • Add support for Intel Rocketlake and AMD Zen3 variant (Family 19, Model 0x50)
  • Fix for perf_event multiplexing (important!)
  • Fix for potential deadlock in MarkerAPI (thx @jenny-cheung)
  • Build and runtime fixes for Nvidia GPU backend, updates for CUDA test codes
  • peakflops kernel for ARMv8
  • Updates for AMD Zen1/2/3 event lists and groups
  • Support spaces in MarkerAPI region tags (thx @jrmadsen)
  • Use 'online' cpulist instead of 'present'
  • Switch CI from Travis-CI to NHR@FAU Cx services
  • Document -reset and -ureset for likwid-setFrequencies
  • Reset cpuset in unpinned runs
  • Remove destructor in frequency module
  • Check PID if given through --perfpid
  • Intel Icelake: OFFCORE_RESPONSE events
  • AccessDaemon: Check PCI init state before using it
  • likwid-mpirun: Set mpi type for SLURM automatically
  • likwid-mpirun: Fix for skip mask for OpenMPI
  • Fix for triad_sve* benchmarks
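
For context on the 'online' versus 'present' change: on Linux both lists are exposed as cpulist strings such as "0-3,8-11" in /sys/devices/system/cpu/online and /sys/devices/system/cpu/present. The following is a minimal sketch of parsing such a list in C; it is not LIKWID's implementation, and the function names count_cpus_in_list and count_cpus_in_file are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the CPUs named in a kernel cpulist string such as "0-3,8-11".
 * Returns the number of CPUs, or -1 if the list is malformed. */
int count_cpus_in_list(const char *list)
{
    int count = 0;
    const char *p = list;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p || first < 0) {
            return -1;          /* no valid CPU ID at this position */
        }
        long last = first;
        p = end;
        if (*p == '-') {        /* range of the form first-last */
            p++;
            last = strtol(p, &end, 10);
            if (end == p || last < first) {
                return -1;
            }
            p = end;
        }
        count += (int)(last - first + 1);
        if (*p == ',') {
            p++;                /* move on to the next entry */
        } else if (*p != '\0' && *p != '\n') {
            return -1;          /* unexpected character */
        }
    }
    return count;
}

/* Read the first line of a cpulist file and count the CPUs in it. */
int count_cpus_in_file(const char *path)
{
    char buf[4096];
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    char *line = fgets(buf, sizeof(buf), fp);
    fclose(fp);
    return (line != NULL) ? count_cpus_in_list(buf) : -1;
}
```

Calling count_cpus_in_file on the online and the present file shows whether some hardware threads are present but offline, which is the case where the two lists differ.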

Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.

v5.2.1-rc2

2 years ago

Second release candidate for the 5.2.1 release:

  • Add support for Intel Rocketlake and AMD Zen3 variant (Family 19, Model 0x50)
  • Fix for perf_event multiplexing (important!)
  • Fix for potential deadlock in MarkerAPI (thx @jenny-cheung)
  • Build and runtime fixes for Nvidia GPU backend, updates for CUDA test codes
  • peakflops kernel for ARMv8
  • Updates for AMD Zen1/2/3 event lists and groups
  • Support spaces in MarkerAPI region tags (thx @jrmadsen)
  • Use 'online' cpulist instead of 'present'
  • Switch CI from Travis-CI to NHR@FAU Cx services
  • Document -reset and -ureset for likwid-setFrequencies
  • Reset cpuset in unpinned runs
  • Remove destructor in frequency module
  • Check PID if given through --perfpid
  • Intel Icelake: OFFCORE_RESPONSE events
  • AccessDaemon: Check PCI init state before using it
  • likwid-mpirun: Set mpi type for SLURM automatically
  • likwid-mpirun: Fix for skip mask for OpenMPI

v5.2.1-rc1

2 years ago

This is release candidate 1 for the 5.2.1 release of LIKWID.

Changelog:

  • Add support for Intel Rocketlake and AMD Zen3 variant (Family 19, Model 0x50)
  • Fix for perf_event multiplexing (important!)
  • Fix for potential deadlock in MarkerAPI (thx @jenny-cheung)
  • Build and runtime fixes for Nvidia GPU backend, updates for CUDA test codes
  • peakflops kernel for ARMv8
  • Updates for AMD Zen1/2/3 event lists and groups
  • Support spaces in MarkerAPI region tags (thx @jrmadsen)
  • Use 'online' cpulist instead of 'present'
  • Switch CI from Travis-CI to NHR@FAU Cx services
  • Document -reset and -ureset for likwid-setFrequencies
  • Reset cpuset in unpinned runs
  • Remove destructor in frequency module
  • Check PID if given through --perfpid

v5.2.0

2 years ago

We are happy to release a new major update of the LIKWID tool suite.

  • Support for AMD Zen3 (Core + Uncore)
  • Support for Intel IcelakeSP (Core + Uncore)
  • New affinity code
  • Fix for Ivybridge uncore code
  • Bypass accessDaemon by using rdpmc instruction on x86_64
  • Introduce notion of CPU die in topology module
  • Use CPU dies for socket-lock for Intel CascadelakeAP
  • Add environment variable LIKWID_IGNORE_CPUSET to break out of current CPUset
  • Fixes for affinity module CPUlist sorting
  • Build against system-installed hwloc
  • Update for Intel SkylakeX/CascadelakeX L3 group
  • Rename DataFabric events for all generations of AMD Zen
  • Add static cache configuration for Fujitsu A64FX
  • Add multiplexing checks for perf_event backend
  • Fix for table width of likwid-topology after adding CPU die column
  • Adding RasPi 4 with 32 bit OS as ARMv7
  • Add default groups for Intel Icelake desktop
  • Fix for likwid-setFrequencies to not apply minFreq when setting governor
  • likwid-powermeter: Fix hwthread selection when run with -p
  • likwid-setFrequencies: Get measured base frequency if register is not readable
  • CLOCK group for all AMD Zen
  • Fixes in Nvidia GPU support in NvMarkerAPI and topology module

WARNING: This version has bugs in the perf_event backend. The multiplexing checks cause problems.

WARNING: The benchmarks triad_sve* for ARMv8 chips use only 3 instead of 4 streams.

Note: The groups MEM_DP and MEM_SP use only 6 of 8 memory controllers for Intel Icelake SP. The attached patch fixes both groups.

v5.1.1

3 years ago

Changelog for version 5.1.1:

  • Support for Intel Cometlake desktop (Core + Uncore)
  • Fix for topology module of Fujitsu A64FX
  • Fix for Intel Skylake SP in SNC mode
  • Fix for likwid-perfscope
  • Fix for CLI argument parsing
  • Updated group and data file checkers
  • Vector sum benchmark in SVE
  • FP_PIPE group for Fujitsu A64FX
  • Maximal number of CLI arguments configurable in config.mk (currently 16384)
  • Fix for cpulist_sort function
  • Fix for Intel SkylakeSP/CascadelakeSP CBOX devices in perf_event mode
  • Multiplexing-Fix for perf_event (with warning)
  • Adjust CUDA function pointer names in topology_gpu to avoid name clashes
  • Fix for Lua 5.1
  • Fix for likwid-setFrequencies when reading CPU base frequency

Note: This version does not contain any updates for AMD Zen3 and Intel IcelakeSP.

Note: Uncore measurements on Intel Cascadelake AP systems require an update of the topology module, which will come in 5.2.0.

WARNING: The benchmarks triad_sve* for ARMv8 chips use only 3 instead of 4 streams.

v5.1.0

3 years ago

Changelog for version 5.1.0:

  • Support for Intel Icelake desktop (Core + Uncore)
  • Support for Intel Icelake server (Core only)
  • Support for Intel Tigerlake desktop (Core only)
  • Support for Intel Cannonlake (Core only)
  • Support for Nvidia GPUs with compute capability >= 7.0 (CUPTI Profiling API)
  • Initial support for Fujitsu A64FX (Core) including SVE assembly benchmarks
  • Support for ARM Neoverse N1 (AWS Graviton 2)
  • Support for AMD Zen3 (Core + Uncore but without any events)
  • Check for Intel HWP
  • Fix for TID filter of Skylake SP LLC filter0 register
  • Fix for Lua 5.1
  • Fix for likwid-mpirun skip masks
  • Fortran90 interface for NvMarkerAPI (update)
  • CPU_is_online check to filter non-usable CPU cores
  • Fix for freeMemory in NUMA module (with hwloc backend)
  • Fix for likwid-setFrequencies

We want to thank Intel, AMD, AWS and the University of Regensburg for their support.

If you want to use this release in a publication, please cite: https://doi.org/10.5281/zenodo.4282696

v5.0.2

3 years ago

Changelog for 5.0.2:

  • Fix memory leak in calc_metric()
  • New peakflops benchmarks in likwid-bench
  • Fix for proper NUMA domain handling
  • Improvements for perf_event backend
  • Fix for perfctr and powermeter with perf_event backend
  • Fix for likwid-mpirun for SLURM with cpusets
  • Fix for likwid-setFrequencies in cpusets
  • Update for POWER9 event list
  • Updates for AMD Zen, Zen+ and Zen2 (events, groups)
  • Fix for Intel Uncore events with same name for different devices
  • Fix for file descriptor handling
  • Fix for compilation with GCC10
  • Remove sleep timer warning
  • Update examples C-markerAPI and C-internalMarkerAPI

Note: If you want to use LIKWID 5.0.2 with Lua 5.1, please apply this patch

v5.0.1

4 years ago

I'm happy to announce a new bugfix release of LIKWID 5.

  • Some fixes for likwid-mpirun
    • Fix for hybrid pinning with multiple hosts
    • Fix for perf.groups without core-local events (switch to likwid-pin)
    • Fix for command line parser
    • Fix for mpiopts parameter
    • Add UPMC as Uncore counter to splitUncoreEvents()
    • Expand user-given input to abspath if possible
    • Check for at least one executable in user-given command
    • Add skip mask for SLURM + Intel OpenMP
    • Check if user-given MPI type is available
  • Fix for perf_event backend when used as root
  • Include likwid-marker.h in likwid.h to not break old MarkerAPI code
  • Enable build with ARM HPC compiler (ARMCLANG compiler setting)
  • Fix creation of likwid-bench benchmarks on POWER platforms
  • Fix for build system with NVIDIA_INTERFACE=true and BUILD_APPDAEMON=true
  • Update for executable tester
  • Update for MPI+X test (X: OpenMP or Pthreads)
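
Regarding the likwid-marker.h item above: in LIKWID 5 the MarkerAPI macros live in likwid-marker.h, and this release makes likwid.h include that header so older code keeps building. The following is a minimal sketch of instrumenting a region with the MarkerAPI macros. The workload and the function names measured_sum and run_measurement are hypothetical, and the empty fallback macros are there so the file also compiles when LIKWID_PERFMON is not defined.

```c
#include <stddef.h>

#ifdef LIKWID_PERFMON
#include <likwid-marker.h>
#else
/* Without -DLIKWID_PERFMON the markers expand to nothing,
 * so the code builds and runs without LIKWID. */
#define LIKWID_MARKER_INIT
#define LIKWID_MARKER_REGISTER(regionTag)
#define LIKWID_MARKER_START(regionTag)
#define LIKWID_MARKER_STOP(regionTag)
#define LIKWID_MARKER_CLOSE
#endif

/* Hypothetical workload: sum an array inside the region named "sum". */
double measured_sum(const double *data, size_t n)
{
    double sum = 0.0;
    LIKWID_MARKER_START("sum");
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    LIKWID_MARKER_STOP("sum");
    return sum;
}

/* Initialize the MarkerAPI once, run the region, then close it
 * so the results are written out for likwid-perfctr. */
double run_measurement(const double *data, size_t n)
{
    LIKWID_MARKER_INIT;
    double result = measured_sum(data, n);
    LIKWID_MARKER_CLOSE;
    return result;
}
```

To activate the markers, the program would be compiled with -DLIKWID_PERFMON, linked against the LIKWID library, and run under likwid-perfctr with the -m option.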

Merry Christmas