CRI Resource Manager Releases

Kubernetes Container Runtime Interface proxy service with hardware resource aware workload placement policies

v0.4.1

3 years ago

The documentation in this release has been overhauled, with significant structural improvements and additional content compared to previous releases. End-to-end test coverage has been vastly extended and the test framework significantly improved. This release also contains a number of important bug fixes and a few other functional improvements. Below is a non-exhaustive list of these changes.

Bug fixes

  • agent:
    • refuse to start if NODE_NAME environment variable is not specified
  • memtier policy:
    • fix updating containers after shared pool changes
    • honor CPU isolation opt-out preference
    • honor allowed CPUs in resource discovery
    • fix PMEM-only NUMA node assignment for weird topologies
  • static-pools policy:
    • make dynamic (re-)configuration work properly
    • look for 'cmk isolate' when parsing the container command line
    • re-load legacy config on config update
    • only take pools configuration from legacy config
    • improved sanity check on pool configuration
    • fix node tainting
  • cri-resmgr:
    • fill in defaults for unspecified values in configuration

Other Improvements

  • cri-resmgr:
    • dump outbound requests if debugging is enabled for the 'cri/relay' source
  • resource controllers:
    • page-migrate: split out page-migration into a controller of its own
  • e2e test framework:
    • vastly improved test coverage on multiple distros
  • builds:
    • build binary dist tarballs

Differences wrt. Rolling Master

With the exception of the PRs listed below, all PRs in the inclusive range #411 - #527 have been cherry-picked or back-ported from the rolling master branch to this release. The omitted PRs were excluded for backward compatibility or other similar reasons:

  • #525: cri-resmgr: reuse 'rdt' logger for the split out rdt package
  • #490: rdt: use goresctrl 
  • #497: pkg/log: switch logger to use klog
  • #472: e2e: add tests for static-pools
  • #489: static-pools: slight refactoring and renaming
  • #483: static-pools: lazier node updates
  • #475: static-pools: drop all cmdline flags

v0.4.0

3 years ago

Major changes

  • 'topology-aware' policy superseded by 'memtier'
  • support for cold start of containers
  • support for dynamic demotion of memory
  • support for limiting container top tier/DRAM memory usage (requires kernel support)
  • support for externally adjusting container resource assignments
  • multi-die aware resource allocation
  • binary distribution with packages for popular Linux distributions and images on Docker Hub

Detailed changelog

Policies

  • 'topology-aware' policy superseded by 'memtier', which
    • is a forked and improved version of 'topology-aware'
    • has the same basic functionality
    • has a number of improvements and extra functionality:
      • multi-die topology support
      • multi-tier (DRAM/PMEM) memory support
      • top tier/DRAM memory limiting
      • container 'cold start' support: force containers initially exclusively to PMEM
      • experimental dynamic page demotion: periodically move least-used pages from DRAM to PMEM
      • experimental support for dynamic external adjustments to container resource assignments
    • has a number of resource assignment/allocation fixes (which are no longer back-ported to 'topology-aware')
    • will in the next release replace 'topology-aware' altogether
  • static-pools:
    • compatibility back-ports from CMK: advertise CPUs in the 'shared' and 'infra' pools via the CMK_CPUS_SHARED and CMK_CPUS_INFRA environment variables
  • common:
    • support for new Pod annotation controls (see the example after this list):
      • opt out from automatic topology hint generation:
        • topologyhints.cri-resource-manager.intel.com/pod: false
        • topologyhints.cri-resource-manager.intel.com/container.$name: false
      • set DRAM/top tier memory limit:
        • toptierlimit.cri-resource-manager.intel.com/pod: $limit
        • toptierlimit.cri-resource-manager.intel.com/container.$name: $limit
    • make simple container affinities always implicitly symmetric
    • limit user-defined container affinity to [-1000,1000]
    • re-trigger pod cgroupfs parent directory and QoS class discovery if necessary
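
As an illustration, a minimal Pod sketch using these annotation controls might look like the following. The pod and container names, the image, and the 2G limit value are hypothetical placeholders; only the annotation keys come from this release.

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-pod                                            # hypothetical name
      annotations:
        # opt the whole pod out of automatic topology hint generation
        topologyhints.cri-resource-manager.intel.com/pod: "false"
        # limit top tier/DRAM memory usage of container 'app' (illustrative value)
        toptierlimit.cri-resource-manager.intel.com/container.app: "2G"
    spec:
      containers:
        - name: app                                                # hypothetical name
          image: busybox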

Resource controllers

  • RDT:
    • remove controller-level class name mapping
    • don't consider assignment to a default class an error if no classes are defined
    • fix crash/misplaced logging of group deletion
  • Block I/O:
    • remove controller-level class name mapping
    • don't consider assignment to a default class an error if no classes are defined
  • CRI:
    • properly send out generated/queued UpdateContainerResources requests

Data collectors

  • cgroupstats:
    • use/report container IDs
    • fix hugetlb size parsing
  • avx:
    • switch to cilium/ebpf from iovisor/gobpf

cri-resmgr

  • new command line options:
    • reset cached configuration: --reset-config
    • reset cached policy data: --reset-policy
  • always set up node agent connection, even when running with --force-config
  • allow switching policies during startup, unless started with --disable-policy-switch

Packaging

  • install the sample fallback config as a fallback, not as the real configuration file
  • use /etc/default for defaults on debian-based distros
  • support Ubuntu 20.04 and openSUSE 15.2

Documentation

Testing

  • end-to-end test framework added

v0.3.1

3 years ago

This v0.3.1 patch release adds packaging and build fixes on top of the v0.3.0 release.

Changes:

  • feature: add command line options for resetting the active policy in the cache and allow this to happen automatically during startup if necessary
  • fix: make NUMA CPU/memory attachment detection work with older kernels
  • fix: move from gobpf to a Cilium-based AVX eBPF implementation to address build issues on older kernels
  • fix: add targets for containerized cross-builds for distro packages

v0.3.0

3 years ago
  • added memory-tiering policy: topology-aware policy with support for DRAM, PMEM (Intel Optane DC), and HBM (High Bandwidth Memory) allocation
  • added blockio controller: class-based control over block I/O using the cgroupfs blkio controller
  • added support for metrics collection:
    • collection of raw metrics data, exporting to Prometheus
    • AVX512 usage: collect per container AVX512 instruction usage, tag containers accordingly
  • rdt controller improvements: disjoint partitioning, L3 and memory bandwidth monitoring, and Intel RDT metrics
  • new annotations (see the example after this list):
    • assign full pod or a container to block I/O or RDT class:
      • rdtclass.cri-resource-manager.intel.com/container.$container: class-name
      • rdtclass.cri-resource-manager.intel.com/pod: class-name
      • blockioclass.cri-resource-manager.intel.com/container.$container: class-name
      • blockioclass.cri-resource-manager.intel.com/pod: class-name
    • memtier policy preference for type of memory allocated to a container:
      • memory-type.cri-resource-manager.intel.com/container.$container: [dram,][pmem,][hbm]
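
For example, a Pod sketch combining these annotations might look as follows. The pod and container names, the image, and the class names are hypothetical placeholders that would have to match classes configured for the RDT and block I/O controllers; only the annotation keys come from this release.

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-pod                                              # hypothetical name
      annotations:
        # assign container 'app' to a pre-configured RDT class (hypothetical class name)
        rdtclass.cri-resource-manager.intel.com/container.app: highprio
        # assign the whole pod to a pre-configured block I/O class (hypothetical class name)
        blockioclass.cri-resource-manager.intel.com/pod: throttled
        # prefer DRAM and PMEM memory for container 'app'
        memory-type.cri-resource-manager.intel.com/container.app: dram,pmem
    spec:
      containers:
        - name: app                                                  # hypothetical name
          image: busybox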

v0.2.0

4 years ago

This release implements a more general, unified mechanism for handling runtime configuration.

v0.1.0

4 years ago

Initial release of the project, with major functionality available in an alpha state.

Note: this is a pre-production Alpha release. Not for production use!