Cri Resource Manager Versions

Kubernetes Container Runtime Interface proxy service with hardware resource aware workload placement policies

v0.6.0

3 months ago

This release brings dependencies up to date with recent versions. It contains a small number of functional improvements and fixes, and a large number of fixes and other improvements to the end-to-end tests.

Major Changes

  • build:
    • update K8s dependencies to v1.22.2
    • bump golang version to v1.16
  • fixes and improvements:
    • container cgroup directory discovery fixes
    • RDT pod QoS class discovery fixes in discovery mode
    • agent configuration: authorize access to adjustments
    • clean up cgroup and group control abstraction
    • remove SST code and pull it in from goresctrl
  • end-to-end test framework:
    • new distributions: sles, opensuse-tumbleweed, ubuntu-21.04
    • installing and debugging locally built CRI-O, containerd and runc
    • configurable CRI runtime pipe and Kubernetes version

Other improvements

  • testing, demos:
    • end-to-end tests: a large number of end-to-end test fixes and other test infra improvements
    • blockio demo: fix detecting already installed cri-resmgr
    • blockio demo: always drop caches before measuring blockio speed

List of Merged PRs

  • PR #731: e2e: more robust coldstart test
  • PR #730: 0.6.0 release preparation: always try to enable 'SystemdCgroup = true' for tests with containerd.
  • PR #728: 0.6.0 release preparation: use distinctive VM names for packaging tests.
  • PR #729: 0.6.0 release preparation: add support for testing with cross-built distro binaries.
  • PR #725: 0.6.0 release preparation: ubuntu-21.04 cross-build and tests.
  • PR #727: 0.6.0 release preparation: centos-7 test cluster bootstrapping fixes.
  • PR #726: 0.6.0 release preparation: use latest fedora image for cross-build.
  • PR #724: 0.6.0 release preparation: update sid image URL.
  • PR #721: e2e: add support for distro=ubuntu-21.04
  • PR #722: go.mod: update to K8s deps to v1.22.2
  • PR #720: Bump to golang v1.16
  • PR #719: distro: force non-interactive 'apt-get install'.
  • PR #717: Drop travis CI support
  • PR #711: Integrate with goresctrl
  • PR #715: github: run tests before golangci-lint
  • PR #714: control/rdt: fix discovery of pod qos classes in discovery mode
  • PR #713: e2e: support distro=opensuse-tumbleweed
  • PR #712: e2e: add vm-put-pkg, install a package from host to vm
  • PR #710: e2e: fix cloud-init error on distro=debian-sid
  • PR #706: e2e: make sure tests have 'pidof' installed on fedora.
  • PR #707: e2e: fix sysctl settings that break cilium CNI on Fedora
  • PR #703: e2e: support running tests with CRI-O and cri-resmgr in NRI mode
  • PR #696: e2e: wait for cloud-init to finish during VM bootstrap.
  • PR #701: e2e: fix opensuse cloud-init and handle wrong containerd
  • PR #699: e2e: follow HTTP redirects when fetching apt repo keys.
  • PR #698: e2e: fix (EOL'd) Ubuntu Groovy image URL.
  • PR #697: e2e: allow installing cri-o from distro repos.
  • PR #694: scripts: add CRI-O support to kube-cgroups
  • PR #656: e2e: add support for k8s=X.Y.Z to set Kubernetes version
  • PR #660: docs: fix pkg urls in quick-start instructions
  • PR #690: e2e: distro=sles uses official package repositories
  • PR #689: e2e: enable reinstalling pretty much everything on VMs
  • PR #688: e2e: add support for distro=sles
  • PR #657: e2e: add an init container test
  • PR #687: edited e2e-test.md
  • PR #654: scripts: kube-cgroups prints cgroup entries per pod/container
  • PR #685: e2e: improve isolcpus test robustness
  • PR #684: e2e: clean up vm after successful reserved-resources test run
  • PR #683: e2e: blockio test for k8scri=crio and k8scri=containerd
  • PR #682: e2e: support CRI-O, containerd, and containerd + cri-resmgr as NRI
  • PR #681: e2e: cri-resource-manager configuration is optional in test suites
  • PR #680: e2e: allow templating in test suite variable files
  • PR #679: e2e: add function for checking if local binary is out-of-date
  • PR #678: e2e: change e2e test framework title
  • PR #677: e2e: support annotations in common pod templates
  • PR #676: e2e: add vm functions for dlv debugging
  • PR #675: e2e: add vm-install-runc
  • PR #674: e2e: add vm-put-docker-image to script API
  • PR #673: e2e: enable running without govm if VM_IP is set
  • PR #672: e2e: fix (remove) empty names from allowed resources printing
  • PR #671: e2e: switch k8s install source in opensuse
  • PR #670: e2e: fix reinstalling containerd on opensuse
  • PR #669: e2e: distro install crio
  • PR #668: e2e: distro: enable running fedora with cgroups=v2
  • PR #667: e2e: fix error message after installing golang from tar
  • PR #666: e2e: always install git-core with golang
  • PR #665: e2e: run apt-get install -y with default answers to dpkg
  • PR #662: e2e: Fix govm installation documentation
  • PR #663: e2e: lib: Use proper locale for bc to work
  • PR #661: e2e: require host dependencies jq and pv
  • PR #651: Basic edits to docs
  • PR #649: e2e: add goresctrl debugging support to "run.sh debug"
  • PR #648: blockio demo: fix detecting already installed cri-resmgr
  • PR #647: blockio demo: always drop caches before measuring blockio speed
  • PR #646: cache: add a directory to findContainerDir search path
  • PR #643: docs: a bunch of grammatical and stylistic fixes by DougTW.
  • PR #644: e2e: add tests for topology-aware mixed CPU allocations
  • PR #645: e2e: test topology-aware allocations with kernel isolcpus set
  • PR #642: fixes: fixes for fedora 33
  • PR #639: cgroups: add cleaned up cgroup, group control abstraction.
  • PR #641: docs: update Pygments requirements
  • PR #638: e2e: fix agent installation
  • PR #637: cri-resmgr-agent: authorize access to adjustments.
  • PR #621: e2e: fuzz topology-aware

v0.5.0

10 months ago

This release brings general stability and correctness improvements. It merges the memory tiering policy into the original topology-aware policy and includes a number of important fixes for resource accounting and assignment.

Major Changes

  • policies:

    • Add new podpools policy for pod-granularity workload placement
    • topology-aware: merge topology-aware and memory tiering policies
    • topology-aware: honor CPU reservation/reserved CPU set in configuration
    • topology-aware: unify syntax for per container and pod annotated preferences
  • RDT:

    • split out RDT manipulation code to a self-contained package, https://github.com/intel/goresctrl
    • implement operating modes (Disabled, Discovery, Full)
    • add option to disable RDT monitoring
    • support L2 cache allocation
  • CPU allocator (used by topology-aware and podpools policies):

    • detect CPU priority levels with Intel Speed Select Technology (SST)

Bug Fixes

  • policies:

    • topology-aware: several significant cpu and memory accounting fixes
    • topology-aware: fixes in gradually relaxed memory pinning for OOM-prevention
    • topology-aware: better handling of bounding and reserved resources
    • topology-aware: fix assignment of CPU-less memory zones
    • topology-aware: fix building sparse topology trees
  • RDT:

    • use root class as a fallback for missing classes
    • empty class implies root class
    • do forceful rdt (re-)configuration
  • resource-manager:

    • force full reallocation when switching policies
    • run post-update hooks after reconfiguration
    • save cache at startup
  • config:

    • handle composite structs in Module.validate()
  • cache:

    • (over)write cache file atomically
  • testing:

    • e2e: fix clearing cri-resmgr cache on uninstall
    • e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
  • documentation:

    • fix static-pools debug logging instructions
    • sample-configs: sample configuration fixes

Other Improvements

  • policies:

    • topology-aware: more regular annotation interpretation for CPU allocation preferences
  • resource-manager:

    • dump extra data for message disambiguation
    • flush logs after every request/event processed
  • cache:

    • log name on pod/container removal
  • cri-resmgr:

    • increase allowed service journal log bursts
  • logging:

    • switch logger to use klog
  • testing:

    • e2e: add tests for memset expansion in topology-aware policy
    • e2e: add vm-put-docker-image to the vm library
    • e2e: allow user override for VM_SSH_USER over distro-ssh-user
    • e2e: generalize templating any file with instantiate()
    • e2e: set imagePullPolicy on every test pod
    • e2e: support namespaced kubectl create from templates
    • e2e: unified memory-type and cold-start annotation syntax
    • e2e: update dynamic page demotion tests
    • e2e: update podpools tests to pass with new cpuallocator
    • e2e: update tests on pinning reserved CPUs
    • benchmark: add memtier_benchmark for memcached/redis
  • documentation:

    • improve RDT documentation

List of Merged PRs

  • PR #528: build: include only cri-resmgr in binary dist tarballs
  • PR #529: docs: fix static-pools debug logging instructions
  • PR #530: memtier/c4pmem4/test03-coldstart: don't jump the gun
  • PR #536: .github: update issue template for new releases
  • PR #537: docs: minor fixes in html template customization
  • PR #538: docs: use 'release branch' as the current version in versions menu
  • PR #540: e2e: support namespaced kubectl create from templates
  • PR #541: e2e: fix clearing cri-resmgr cache on uninstall
  • PR #542: e2e: generalize templating any file with instantiate()
  • PR #543: memtier: implement reserved CPUs pool
  • PR #545: resource-manager: run post-update hooks after reconfiguration
  • PR #546: go.mod: update to Kubernetes v1.19.4
  • PR #547: scripts: helper for maintaining replace lines in go.mod
  • PR #549: benchmark: add memtier_benchmark for memcached/redis
  • PR #550: test/functional: prevent read/write data race in klog
  • PR #553: docs: quote text containing '<' and '>' using `` in affinity docs
  • PR #555: scripts/update-gh-pages: more intelligent http redirect
  • PR #556: e2e: allow user override for VM_SSH_USER over distro-ssh-user
  • PR #557: Improve CPU prioritization
  • PR #560: e2e: add vm-put-docker-image to the vm library
  • PR #561: memtier: rework building of pool tree by HW topology
  • PR #562: docs: improve rdt documentation
  • PR #563: memtier/pool test: fix fd leakage causing test panics with more data
  • PR #566: Kata container support
  • PR #567: config: handle composite structs in Module.validate()
  • PR #568: control/rdt: add option to disable rdt monitoring
  • PR #570: page-migrate: add cache-like container.GetPodID()
  • PR #571: config: fix typo in log message
  • PR #572: control/rdt: fix and simplify handling of implicit disabling
  • PR #573: control/rdt: empty class implies root class
  • PR #574: control/rdt: implement assignAll()
  • PR #575: control/rdt: do forceful rdt (re-)configuration
  • PR #576: control/rdt: correct usage of checkIdle() in configNotify()
  • PR #577: control/rdt: implement operating modes
  • PR #579: memtier: don't imply error by signature for functions that never fail
  • PR #580: docs: use an explicit version of recommonmark
  • PR #581: rdt: accept missing default classes in Discovery mode
  • PR #583: docs: refer to the latest release in the installation instructions
  • PR #586: rdt: use root class as a fallback to missing classes
  • PR #587: e2e: set imagePullPolicy on every test pod
  • PR #588: memtier: unify syntax for annotated preferences
  • PR #589: memtier: fix build error introduced by improper, unrebased merging of both #524 and #543
  • PR #590: memtier: more regular annotation interpretation for CPU allocation preferences
  • PR #591: fix: nil pointer dereference on updateSharedAllocations(nil)
  • PR #592: e2e: unified memory-type and cold-start annotation syntax
  • PR #594: policy/builtin/*: fix outdated comment about PolicyName
  • PR #595: docs: recognize/handle .md-links to element IDs
  • PR #596: server,resource-manager: flush logs after every request/event processed
  • PR #597: resource-manager: rename 'memtier' policy to 'topology-aware'
  • PR #598: podpools: policy for pod-granularity workload placement
  • PR #599: rdt: fix order of params passed to GetTasksInContainer()
  • PR #600: test: drop stale rdt testdata
  • PR #601: topology-aware: improved topology tree/node dump
  • PR #602: cpuallocator: add CPU priority levels
  • PR #604: e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
  • PR #606: Extended detection of Intel Speed Selection Technology (SST)
  • PR #607: klog: skip headers for journald by default
  • PR #608: cri-resmgr: increase allowed service journal log bursts
  • PR #609: fixes: topology-aware policy cpu/memory accounting fixes
  • PR #610: resource-manager: force full reallocation when switching policies
  • PR #612: topology-aware: force reserved/kube-system containers to the root
  • PR #613: e2e: add tests for memset expansion in topology-aware policy
  • PR #614: resource-manager,dump: dump extra data for message disambiguation
  • PR #615: topology-aware: better and more readable logs
  • PR #616: topology-aware: memory accounting and memset expansion fixes
  • PR #617: resource-manager: catch containers earlier when they are gone
  • PR #618: e2e: update podpools tests to pass with new cpuallocator
  • PR #622: topology-aware: use normal as fallback for reserved
  • PR #623: e2e: update tests on pinning reserved CPUs
  • PR #624: topology-aware: use prettyMem() in log messages
  • PR #625: cache: (over)write cache file atomically
  • PR #626: resource-manager: save cache at startup
  • PR #627: cache: log name on pod/container removal
  • PR #628: rdt: support L2 cache allocation
  • PR #629: topology-aware: fix filtering out nodes with insufficient memory
  • PR #630: topology-aware: fix moving up memory grant
  • PR #631: pkg/sysfs: clarifying comment on getCPUMapping()
  • PR #632: e2e: update dynamic page demotion tests
  • PR #634: sample-configs: make cri-resmgr-configmap.example.yaml usable
  • PR #636: podpools: fix reflect JSON tag typo

v0.4.1

1 year ago

The documentation in this release has been overhauled with significant structural improvements and additional content over previous ones. End-to-end test coverage has been vastly extended and the test framework significantly improved. This release contains a number of important bug fixes and a few other functional improvements. Here is a non-exhaustive list of these.

Bug fixes

  • agent:
    • refuse to start if NODE_NAME environment variable is not specified
  • memtier policy:
    • fix updating containers after shared pool changes
    • honor CPU isolation opt-out preference
    • honor allowed CPUs in resource discovery
    • fix PMEM-only NUMA node assignment for weird topologies
  • static-pools policy:
    • make dynamic (re-)configuration work properly
    • look for cmk isolate when parsing container command line
    • re-load legacy config on config update
    • only take pools configuration from legacy config
    • improved sanity check on pool configuration
    • fix node tainting
  • cri-resmgr:
    • fill in defaults for unspecified values in configuration

Other Improvements

  • cri-resmgr:
    • dump outbound requests if debugging is enabled for the 'cri/relay' source
  • resource controllers:
    • page-migrate: split out page-migration into a controller of its own
  • e2e test framework:
    • vastly improved test coverage on multiple distros
  • builds:
    • build binary dist tarballs

Difference wrt. Rolling Master

With the exception of the PRs listed below, all PRs in the inclusive range #411 - #527 have been cherry-picked or back-ported from the rolling master branch to this release. The omitted PRs have been excluded for backwards compatibility or similar reasons:

  • #525: cri-resmgr: reuse 'rdt' logger for the split out rdt package
  • #490: rdt: use goresctrl 
  • #497: pkg/log: switch logger to use klog
  • #472: e2e: add tests for static-pools
  • #489: static-pools: slight refactoring and renaming
  • #483: static-pools: lazier node updates
  • #475: static-pools: drop all cmdline flags

v0.4.0

1 year ago

Major changes

  • 'topology-aware' policy superseded by 'memtier'
  • support for cold start of containers
  • support for dynamic demotion of memory
  • support for limiting container top tier/DRAM memory usage (requires kernel support)
  • support for externally adjusting container resource assignments
  • multi-die aware resource allocation
  • binary distribution with packages for popular Linux distributions and images at Docker Hub

Detailed changelog

Policies

  • 'topology-aware' policy superseded by 'memtier', which
    • is a forked and improved version of 'topology-aware'
    • has the same basic functionality
    • has a number of improvements and extra functionality:
      • multi-die topology support
      • multi-tier (DRAM/PMEM) memory support
      • top tier/DRAM memory limiting
      • container 'cold start' support: force containers initially exclusively to PMEM
      • experimental dynamic page demotion: periodically move least-used pages from DRAM to PMEM
      • experimental support for dynamic external adjustments to container resource assignments
    • has a bunch of resource assignment/allocation fixes (which are not backported to 'topology-aware' any more)
    • will in the next release replace 'topology-aware' altogether
  • static-pools:
    • compatibility back-ports from CMK: advertise CPUs in 'shared', 'infra' pools via CMK_CPUS_SHARED, CMK_CPUS_INFRA environment variables
  • common:
    • support for new Pod annotation controls:
      • opt out from automatic topology hint generation:
        • topologyhints.cri-resource-manager.intel.com/pod: false
        • topologyhints.cri-resource-manager.intel.com/container.$name: false
      • set DRAM/top tier memory limit:
        • toptierlimit.cri-resource-manager.intel.com/pod: $limit
        • toptierlimit.cri-resource-manager.intel.com/container.$name: $limit
    • make simple container affinities always implicitly symmetric
    • limit user-defined container affinity to [-1000,1000]
    • re-trigger pod cgroupfs parent directory and QoS class discovery if necessary
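
The annotation keys listed above follow a regular pattern: a control name, the cri-resource-manager.intel.com domain, and either /pod or /container.$name. The sketch below builds those keys and limits an affinity weight to [-1000,1000]. It is an illustration only. The helper names (podKey, containerKey, clampAffinity), the container name "c0", and the limit value "2G" are hypothetical, and clamping out-of-range weights (rather than rejecting them) is an assumption, since the notes only say the range is limited.

```go
package main

import "fmt"

// domain is the annotation domain used by the keys in the release notes.
const domain = "cri-resource-manager.intel.com"

// podKey returns the pod-wide annotation key for a control such as
// "topologyhints" or "toptierlimit". Hypothetical helper.
func podKey(control string) string {
	return control + "." + domain + "/pod"
}

// containerKey returns the per-container annotation key for a control.
// Hypothetical helper.
func containerKey(control, container string) string {
	return control + "." + domain + "/container." + container
}

// clampAffinity limits a user-defined affinity weight to [-1000, 1000].
// ASSUMPTION: out-of-range values are clamped rather than rejected.
func clampAffinity(weight int) int {
	const limit = 1000
	if weight > limit {
		return limit
	}
	if weight < -limit {
		return -limit
	}
	return weight
}

func main() {
	// Annotations as they might appear in a pod's metadata.
	annotations := map[string]string{
		// Opt the whole pod out of automatic topology hint generation.
		podKey("topologyhints"): "false",
		// Limit top tier/DRAM memory for container "c0"; the value
		// format "2G" is an assumption made for illustration.
		containerKey("toptierlimit", "c0"): "2G",
	}
	fmt.Println(annotations[podKey("topologyhints")])
	fmt.Println(annotations[containerKey("toptierlimit", "c0")])
	fmt.Println(clampAffinity(5000), clampAffinity(-5000), clampAffinity(10))
}
```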

Resource controllers

  • RDT:
    • remove controller-level class name mapping
    • don't consider assignment to a default class an error if no classes are defined
    • fix crash/misplaced logging of group deletion
  • Block I/O:
    • remove controller-level class name mapping
    • don't consider assignment to a default class an error if no classes are defined
  • CRI:
    • properly send out generated/queued UpdateContainerResources requests

Data collectors

  • cgroupstats:
    • use/report container IDs
    • fix hugetlb size parsing
  • avx:
    • switch to cilium/ebpf from iovisor/gobpf

cri-resmgr

  • new command line options:
    • reset cached configuration: --reset-config
    • reset cached policy data: --reset-policy
  • always set up node agent connection, even when running with --force-config
  • allow switching policies during startup, unless started with --disable-policy-switch

Packaging

  • install the sample fallback config as a fallback, not as the real config file
  • use /etc/default for defaults on debian-based distros
  • support Ubuntu 20.04, OpenSUSE 15.2

Documentation

Testing

  • end-to-end test framework added

v0.3.1

1 year ago

This v0.3.1 patch release adds packaging and build fixes on top of the v0.3.0 release.

Changes:

  • feature: add command line options for resetting the active policy in the cache and allow this to happen automatically during startup if necessary
  • fix: NUMA CPU-/memory-attachment detection code to work with older kernels
  • fix: move from gobpf to Cilium-based AVX eBPF implementation to address build issues on older kernels
  • fix: add targets for containerized cross-builds for distro packages

v0.3.0

1 year ago

  • added memory-tiering policy: topology-aware policy with support for DRAM, PMEM (Intel Optane DC) and HBM (High Bandwidth Memory) allocation
  • added blockio controller: class-based control over block I/O using the cgroupfs blkio controller
  • added support for metrics collection:
    • collection of raw metrics data, exporting to Prometheus
    • AVX512 usage: collect per container AVX512 instruction usage, tag containers accordingly
  • rdt controller improvements: disjoint partitioning, L3 and memory bandwidth monitoring, and Intel RDT metrics
  • new annotations:
    • assign full pod or a container to block I/O or RDT class:
      • rdtclass.cri-resource-manager.intel.com/container.$container: class-name
      • rdtclass.cri-resource-manager.intel.com/pod: class-name
      • blockioclass.cri-resource-manager.intel.com/container.$container: class-name
      • blockioclass.cri-resource-manager.intel.com/pod: class-name
    • memtier policy preference for type of memory allocated to a container:
      • memory-type.cri-resource-manager.intel.com: $container: [dram,][pmem,][hbm]
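
The class annotations above come in a per-container and a pod-wide form. A minimal sketch of how a lookup over them could work is shown below. It is not the project's implementation: the function name classFor, the class names "gold" and "silver", and the container names are hypothetical, and the precedence (a per-container annotation overriding the pod-wide one) is an assumption that the release notes do not state.

```go
package main

import "fmt"

// classFor looks up the class assigned to a container by pod annotations.
// kind is "rdtclass" or "blockioclass". Hypothetical helper.
// ASSUMPTION: a per-container annotation takes precedence over the
// pod-wide one; the release notes do not specify the precedence.
func classFor(annotations map[string]string, kind, container string) (string, bool) {
	const domain = "cri-resource-manager.intel.com"
	if class, ok := annotations[kind+"."+domain+"/container."+container]; ok {
		return class, true
	}
	class, ok := annotations[kind+"."+domain+"/pod"]
	return class, ok
}

func main() {
	// Hypothetical pod annotations; class names are made up.
	annotations := map[string]string{
		"rdtclass.cri-resource-manager.intel.com/pod":          "silver",
		"rdtclass.cri-resource-manager.intel.com/container.c0": "gold",
	}
	for _, name := range []string{"c0", "c1"} {
		class, ok := classFor(annotations, "rdtclass", name)
		fmt.Println(name, class, ok)
	}
	// No block I/O class is annotated, so the lookup reports false.
	_, ok := classFor(annotations, "blockioclass", "c0")
	fmt.Println(ok)
}
```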

v0.2.0

2 years ago

Implement a more general, unified mechanism for handling runtime configuration.

v0.1.0

2 years ago

Initial release for the project with major functionality available in alpha state.

Note: this is a pre-production alpha release. Not for production use!