Kernel Tuner
Finally, the Version 1.0 release is here! The software has been stable and ready for production use for quite some time, and after about half a year in beta, we are confident that the current version deserves to mark the first major release of Kernel Tuner.
Version 1.0 integrates a lot of new functionality, including blazing fast search space construction, support for tuning HIP kernels on AMD GPUs, new functionality for mixed precision and accuracy tuning, experimental support for tuning OpenACC programs, a conda package installer for Kernel Tuner, and many more changes and additions.
I would like to thank everyone involved in the development of Kernel Tuner over the past years! Special thanks to the Kernel Tuner developer team for their continued support of the project!
PySMT and ATF for searchspace building
setup.py and setup.cfg to pyproject.toml for centralized metadata, added relevant tests
pyproject.toml metadata, minor fixes and changes to be compatible with updated dependencies
OrderedDict, as all dictionaries in the Python versions used are already ordered
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/0.4.5...1.0
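The OrderedDict item in the 1.0 changes rests on a general Python guarantee: since Python 3.7, the built-in dict preserves insertion order. The minimal sketch below illustrates that guarantee only. The parameter names and values are hypothetical and are not taken from the Kernel Tuner code base.

```python
from collections import OrderedDict

# Hypothetical tunable parameters, chosen for illustration only.
entries = [
    ("block_size_x", [32, 64, 128]),
    ("block_size_y", [1, 2, 4]),
    ("tile_size", [1, 2]),
]

# Build the same mapping once as a plain dict and once as an OrderedDict.
plain = {}
ordered = OrderedDict()
for name, values in entries:
    plain[name] = values
    ordered[name] = values

# Both containers iterate over their keys in insertion order (Python 3.7+).
keys_plain = list(plain)
keys_ordered = list(ordered)
same_order = keys_plain == keys_ordered

print(keys_plain)
print(same_order)
```

One behavioral difference remains: comparing two OrderedDict objects is order-sensitive, while comparing two plain dicts is not. Code that relied on order-sensitive equality would need a separate check after such a switch.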
This is a beta release for early access to the new features. Not intended for production use.
The release contains:
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/1.0.0b4...1.0.0b5
This is a beta release for early access to the new features. Not intended for production use.
This release contains several improvements:
nvidia-ml-py added to tutorial extra dependencies.
This is a beta release for early access to the new features. Not intended for production use.
This version contains several bugfixes:
check_restrictions function.
bayes_opt would not handle pruned parameters correctly.
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/1.0.0b2...1.0.0b3
This is a beta release for early access to the new features. Not intended for production use.
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/1.0.0b1...1.0.0b2
This is a beta release for early access to the new features. Not intended for production use.
Full Changelog: https://github.com/KernelTuner/kernel_tuner/compare/0.4.5...1.0.0b1
Version 0.4.5 adds support for using PMT in combination with Kernel Tuner, enabling power and energy measurements on a wide range of devices. In addition, we have worked extensively on the internals of Kernel Tuner and on the interfaces between the separate components that together make up Kernel Tuner. The release also includes a few bugfixes and fixes for small errors in the examples and documentation.
Version 0.4.4 adds extended support for energy efficiency tuning. In particular, it adds the capability to fit a performance model to the target GPU's power-frequency curve. How to use these features is demonstrated in: https://github.com/KernelTuner/kernel_tuner/blob/master/examples/cuda/going_green_performance_model.py
And described in the paper:
Going green: optimizing GPUs for energy efficiency through model-steered auto-tuning. R. Schoonhoven, B. Veenboer, B. van Werkhoven, K. J. Batenburg. International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) at Supercomputing (SC22), 2022. https://arxiv.org/abs/2211.07260
Other than that, we've implemented a new output and metadata JSON format that adheres to the 'T4' auto-tuning schema created by the auto-tuning community at the Lorentz Center workshop in March 2022.
From the changelog: