Kubespray

Deploy a Production Ready Kubernetes Cluster

v2.20.0

1 year ago

Deprecation / Removal

  • Drop support for Ansible v2.9 and v2.10 (#8925, @oomichi)
  • Drop support for Fedora 34 (#8967, @floryut)

Feature / Major changes

  • Add Rocky Linux 8 support (#8905, @oomichi)
  • Add Kylin Linux support. (#9078, @ErikJiang)
  • Add Fedora 36 support (#8967, @floryut)
  • Add 'flush ip6tables' task in reset role (#9168, @GreatLazyMan)
  • Add tar in common required package (#9184, @yankay)
  • Add support for NTP configuration. (#9027, @yankay)
  • Increase ansible fact_caching_timeout (from 2 to 24 hours) (#9059, @rptaylor)
  • Add kubelet systemd service hardening option kubelet_systemd_hardening: [true|false] (#9194, @alegrey91)
  • Support timezone setting (#9263, @yankay)
  • Update deprecated ansible include syntax (#9040, @boeto)
  • Update etcd download url in offline.yml to use arch (#8943, @ErikJiang)
  • Add Support for Rewrite Plugin to CoreDNS/NodelocalDNS (#9245, @eifelmicha)
  • Add SeccompDefault admission plugin for kubelet (using new variable kubelet_seccomp_default) (#9074, @alegrey91)
  • Add an optional extra_groups parameter for k8s_nodes (e.g. to configure calico route reflector nodes on Openstack using the calico_rr group) (#9211, @rptaylor)
  • Add arm64 Flatcar OS's pypy bootstrapping support (#8959, @kerryeon) (see Notes 1)
  • Add docker support for Kylin distributions (#9144, @ErikJiang)
  • Add hashes for Kubernetes v1.24.3, v1.22.12, v1.23.9 (#9092, @marcofortina)
  • Add ingress nginx webhook (#9033, @liupeng0518)
  • Add manage-offline-files.sh to collect the necessary files and provide an HTTP file download service for offline deployment. (#8956, @ErikJiang)
  • Add missing configuration for extra tolerations (#8908, @smasset)
  • Add support for node & pod pid limits (in kubelet-config file) (#9038, @h9-HSFRQDH)
  • Add the option to enable default Pod Security Configuration (#9017, @Foxlik)
  • Add unsafe_show_logs switch to show more log details (default to false, same as previous behavior) (#9164, @ErikJiang)
  • Add variables (delete_node_retries,delete_node_delay_seconds) to tweak remove node process (#9096, @ydFu)
  • Added 'avoid-buggy-ips' support of MetalLB (metallb_avoid_buggy_ips for default IP address pool and avoid_buggy_ips for additional IP address pools defined in metallb_additional_address_pools) (#9166, @kerryeon) (see Notes 2)
  • Adjust the default value of calico blockSize ipv4 to 26, and ipv6 to 122. (#9055, @cyclinder)
  • Make kubernetes owner parametrized (using kube_owner/kube_cert_group/etcd_owner variables) (#8952, @alegrey91)
  • Move old etcd backup removal after etcd restart, to prevent removing the backup if etcd fails (#9147, @emiran-orange)
  • Support reserving ephemeral-storage (#8895, @Thearas)
  • [dev/docs] add support for pre-commit hook (#9158, @cristicalin)
  • [etcd] The etcd role no longer runs on all nodes every time. (#9173, @liupeng0518)
  • [etcd] add 3.5.4 and drop 3.5.1 and 3.5.2 (#9021, @cristicalin)
  • [infra] bump pause container to 3.6 (#9024, @cristicalin)
  • Update Kubernetes dashboard to 2.6.0 (k8s 1.24 support) (#8906, @floryut)
  • [kubernetes] make 1.24.x the new default (#8935, @cristicalin)
  • [kubernetes] drop support for 1.21.x (#8935, @cristicalin)
  • [kubernetes] drop support for deprecated dynamic_kubelet_configuration (#8935, @cristicalin)
  • [offline] Archive offline-files and add env NO_HTTP_SERVER to skip running the Nginx container. (#9068, @yjqg6666)
  • Adds support for multiple architectures to yq (#9288, @ErmalKristo)
  • Add variable to tweak the vsphere-csi namespace (vsphere_csi_namespace) (#9278, @MahdiAbbasi95)
  • Ensure ping package is installed on the system (#9284, @yankay)
  • Add more functionality to DNS configuration (#9270, @eminaktas)
  • Ensure ostree variable has been defined for fcos (#9321, @electrocucaracha)
  • Support removing options in resolvconf with tab separator (#9304, @2k0ri)
  • preinstall: Add nodelocaldns to supersede_nameserver if enabled (#9282, @azuwis)
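
Several of the options added above are plain inventory variables. As a minimal sketch, assuming they are set in a group_vars file (the file location and the retry/delay values are illustrative assumptions, not defaults taken from these notes):

```yaml
# Hypothetical location: inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
kubelet_systemd_hardening: true   # #9194, accepts true or false
kubelet_seccomp_default: true     # #9074, assumed here to be a boolean
unsafe_show_logs: false           # #9164, false keeps the previous behavior
delete_node_retries: 10           # #9096, illustrative value only
delete_node_delay_seconds: 10     # #9096, illustrative value only
```

Check the defaults shipped with the release before overriding any of these.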

Network

  • [Calico] calico rr now supports multiple groups (#9134, @liupeng0518)
  • [Calico] drop support for 3.19.x and 3.20.x
  • [Calico] Make Calico CNI log path configurable and allow disabling this log (#8921, @fungusakafungus)
  • [Calico] NAT (nat_outgoing) is no longer disabled automatically when enabling peer_with_router. (#9255, @kerryeon)
  • [Calico] The variable calcio_ipam_autoallocateblocks has been renamed to calico_ipam_autoallocateblocks (#9056, @liupeng0518)
  • [Calico] The calico-typha metrics port is now exposed when metrics are enabled (#8855, @vjacynycz)
  • [Calico] Add Wireguard support for Rocky Linux 9 (#9287, @krystianmlynek)
  • [Calico] The parameter calcio_rr_id is renamed to calico_rr_id to fix a typo ⚠️ (#9327, @kerryeon)
  • [Canal] update templates to work again with both etcd and k8s datastore (#9113, @floryut)
  • [Cilium] Add list/watch nodes rules to cilium-operator clusterrole. (#9178, @Thearas)
  • [Cilium] Add support for the updated (startup|liveness|readiness)Probe.Port numbers (#9031, @tomberget)
  • [Cilium] Update cilium to v1.11.7 (#9119, @dkhachyan)
  • [Cilium] Make rolling-restart readiness wait delay and count configurable via cilium_rolling_restart_wait_retries_{count, delay_seconds} (#9176, @Tristan971)
  • [Cilium] Upgrades cilium to 1.11.6 and add some default variables. (#9065, @eminaktas) (See Notes 3)
  • [Cilium] Update Cilium default to 1.12.x (#9225, @necatican) (See Notes 5)
  • [Cilium] Dropped support for < v1.10.0 (#9225, @necatican)
  • [Cilium] The cilium_ip_masq_agent_enable variable no longer exists. Use enable-ipv4-masquerade and enable-ipv6-masquerade to enable masquerade. (#9225, @necatican)
  • [flannel] update to v0.18.1 & make it default (#9104, @mzaian)
  • [flannel] update to v0.19.2 & make it default (#9296, @mzaian)
  • [Kube-vip] Fail if kube_proxy_strict_arp is set to false in arp mode (#9223, @yankay)
  • [Multus] Support multi-architecture installation (#9012, @cyclinder)
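
The Calico renames and the new Cilium rolling-restart settings above translate into inventory changes roughly as follows. This is a sketch under assumptions: the values are placeholders, and only the variable names come from these notes.

```yaml
# Old, misspelled names (shown commented out; replace them in your inventory):
# calcio_ipam_autoallocateblocks: true
# calcio_rr_id: "<existing value>"

# Corrected names (#9056, #9327); keep whatever values you already use.
calico_ipam_autoallocateblocks: true   # placeholder value
calico_rr_id: "<existing value>"       # placeholder, typically a per-host variable

# Cilium rolling-restart readiness wait (#9176); numbers are illustrative only.
cilium_rolling_restart_wait_retries_count: 30
cilium_rolling_restart_wait_retries_delay_seconds: 10
```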

Applications

  • [Openstack] Add option to use default deny firewall policy and port allowlisting on UpCloud (#9058, @Ajarmar)
  • [Openstack] Fix subnet order and number of master nodes (#9159, @robinelastisys)
  • [Metallb] Renamed matallb_auto_assign variable to metallb_auto_assign (users disabling 'auto-assign' in metallb must update the variable name) (#8949, @orange-llajeanne)
  • [vSphere-csi] Add nodeAffinity to daemonset using vsphere_csi_node_affinity variable (#9293, @dmitrytretyakov)
  • [upcloud-csi] Bump driver version to v0.3.3 (#9317, @robinAwallace)
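
For the MetalLB rename above, users who disable auto-assign would update their inventory along these lines (a minimal sketch; only the variable names are taken from these notes):

```yaml
# Old, misspelled name (remove it from your inventory):
# matallb_auto_assign: false

# Corrected name (#8949):
metallb_auto_assign: false
```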

Container-Managers

  • [containerd] add hashes for 1.5.12, 1.5.13, 1.6.5 and 1.6.6, make 1.6.6 the new default (#8980, @cristicalin)
  • [containerd] Add LimitMEMLOCK parameter configuration in containerd.service (using containerd_limit_[proc_num/core/open_file_num/mem_lock]) (#9269, @ErikJiang)
  • [containerd] Remove duplication in containerd template (#9301, @fungusakafungus)
  • [containerd] Allow configuring base_runtime_spec per containerd runtime and supply a default runtime spec (#9302, @fungusakafungus)
  • [Docker] use cri-dockerd instead of dockershim by default
  • [Docker] Enable cri-dockerd service to prevent issue with reboot (#9201, @mostafaghadimi)
  • [cri-o] Add dpkg hold for apt installs (#9075, @SamuelBECK1)
  • [cri-o] add support for 1.24.x required by kubernetes 1.24.x (#8935, @cristicalin)
  • [runc] update versions for 1.1.x and drop 1.0.x (#9022, @cristicalin)
  • [runc] Variable containerd_default_runtime is now undefined by default (but defaults to runc) (#9026, @rptaylor)
  • [crun] add 1.4.5 and drop 1.2 and 1.3 (#9023, @cristicalin)
  • [nerdctl] upgrade to 0.20.0 (#8980, @cristicalin) then 0.22.2 (#9180, @panpan0000)

Bug or Regression

  • Fix failure to look up user etcd when adding a user (#9016, @yankay)
  • Fix setting up Kubespray on Azure with CSI drivers. (#9153, @wayfrro)
  • Add --supervisor-fss-namespace=kube-system flag to vcloud-csi installation (#9066, @yasintahaerol)
  • Add assertion for IPv4 check in verify settings (to allow IPv6 deployments) (#8946, @Citrullin)
  • Add calico-kube-controllers missing verbs (#9032, @ghostloda)
  • Allow "openSUSE Tumbleweed" to be run (again) (#9072, @oomichi)
  • Apply calico bgp peer definition task to all nodes (#8974, @orange-llajeanne)
  • Create snapshot namespace only when needed (#9014, @robinAwallace)
  • Disable kubelet_authorization_mode_webhook by default (#9238, @cristicalin)
  • Disabled DNSStubListener for Flatcar Linux (#9160, @kerryeon)
  • Do not run etcd role in scale.yml playbook when etcd installed by kubeadm (#9210, @LuckySB)
  • Fix Hetzner CCM cluster-cidr (wrongly set to a static value) (#9127, @ym)
  • Fix calicoctl.sh path error when getting calico configuration (#9217, @tasekida)
  • Fix failing tasks when calico_datastore is set to etcd (#9228, @chadswen)
  • Fix missing quote in task "See if node is schedulable" (#9146, @emiran-orange)
  • Fix numeric node names failing to be added. (#9266, @cleverhu)
  • Fix regex for replacing http_proxy host in RedHat Subscription Manager (#8957, @dicksontung)
  • Fix some docker reset tasks (don't remove already uninstalled packages, ignore errors when removing docker config files that are already removed) (#8966, @orange-llajeanne)
  • Fix the CentOS/RHEL docker installation issue on ARM64 (#9047, @yankay)
  • Fix the kube-vip missed SAN issue (#9099, @yankay)
  • Fixed concatenate str & int in auto_renew_certificates_systemd_calendar (#8979, @floryut)
  • Fix the issue where the namespace for vsphere-csi-driver could not be set correctly (#9046, @eminaktas)
  • Fixes vSphere CSI for vSphere CSI >= 2.4.0 on vSphere 6.7U3 (#8944, @snowball77)
  • No more errors are emitted when attempting to delete worker nodes that do not exist. (#9244, @kerryeon)
  • Optimize the format of evictionHard in kubelet-config.yaml template (#9204, @shelmingsong)
  • Remove kubeowner different than root condition for user creation (#9125, @alegrey91)
  • Remove unneeded socat wrapper installation for Flatcar (#8970, @kerryeon) (See Notes 4)
  • Set fallback value of kubelet ip6 (#8926, @kerryeon)
  • Swap calico download url, as the old primary url was deprecated and artefact no longer published (#8920, @sathieu)
  • Upgrade the nginx-proxy and haproxy image versions, and use the alpine base image (#9100, @yankay)
  • Variable kube_pid_reserved must be a string (#9124, @liupeng0518)
  • [Docker] Add restart of docker.service during install (#9205, @krystianmlynek)
  • [Kube-ovn] Value check for HW_OFFLOAD is now correctly handled (and will no longer always be false) (#9218, @floryut)
  • [ingress-nginx] Fix ingress-nginx RBAC rules when deployed classless (#9156, @cristicalin)
  • Remove the 'etcd-unsupported-arch' args to fix the etcd issue in arm64 (#9049, @yankay)
  • Fix duplicate field in ingress-nginx template (#9285, @cloud-66)
  • Fix CoreDNS memory leak issue by adding max_concurrent=1000 in the CoreDNS config (#9307, @yankay)
  • Fix ansible user module create_home property (erroneously written as createhome) (#9314, @liupeng0518)
  • Hotfix containerd restart not properly restarting (#9322, @fungusakafungus)

Other (Cleanup or Flake)

  • [CI] upgrade vagrant image for opensuse leap to 15.4 (#9175, @cristicalin)
  • [CI] test upgrade with defaults (containerd) instead of docker (#8980, @cristicalin)
  • [CI] Fix cloud_init files for different distros (#9232, @floryut)
  • git ignore .terraform.lock.hcl in all folders (#9109, @rptaylor)

Component versions:

  • Core
    • Kubernetes v1.24.6
    • Etcd v3.5.4
    • Docker v20.10
    • Containerd v1.6.8
    • CRI-O v1.24
  • Network
    • CNI-plugins v1.1.1
    • Calico v3.23.3
    • Cilium v1.12.1
    • Flannel v0.19.2
    • Kube-ovn v1.9.7
    • Kube-Router v1.5.1
    • Multus v3.8
    • Weave v2.8.1
    • kube-vip v0.4.2
  • App
    • Cert-manager v1.9.1
    • CoreDNS v1.8.6
    • Nginx-ingress v1.3.1
    • krew v0.4.3
    • argocd v2.4.12
    • helm v3.9.4
    • metallb v0.12.1
    • registry v2.8.1

Known issues

  • Host network might break when an interface goes down (Cilium 1.12/Ubuntu 22.04); please read Note 5.
  • If the bin_dir value is changed to something other than /usr/local/bin, the containerd configuration might need to be tweaked; please check #9243

Notes

  1. Upgrading the bootstrap pypy may cause some unexpected behaviors for Flatcar use-cases.
  2. The newly added feature keeps MetalLB's default value, so there is no side effect for users who do not change its value.
  3. This PR also implements cgroup auto-mount. By default, it is enabled. You can disable it by adding cgroup_auto_mount: false. Moreover, you can enable or disable BPF with these variables cilium_enable_bpf_masquerade and cilium_enable_host_legacy_routing
  4. Some old (pre-2020) releases of 'Flatcar Container Linux by Kinvolk' may not be supported.
  5. With Cilium 1.12/Ubuntu 22.04, you might run into this issue; workarounds are available while the issue is being resolved on the Cilium side.
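
The variables named in Notes 2 and 3 could be set as in the sketch below. The variable names are copied from the notes as written; the boolean values are assumptions for illustration, not recommendations.

```yaml
# Note 3: cgroup auto-mount is enabled by default; this turns it off.
cgroup_auto_mount: false

# Note 3: BPF-related switches (assumed to be booleans; values illustrative).
cilium_enable_bpf_masquerade: true
cilium_enable_host_legacy_routing: true

# Note 2: only needed if you want to change MetalLB's default behavior
# for the default IP address pool (assumed to be a boolean).
metallb_avoid_buggy_ips: true
```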

v2.19.1

1 year ago

Feature

  • Add missing configuration for extra tolerations (#8999, @smasset)

Bug or Regression

  • Allow "openSUSE Tumbleweed" to be run (again) (#9072, @oomichi)
  • Disable kubelet_authorization_mode_webhook by default (#9239, @cristicalin)
  • Do not run etcd role in scale.yml playbook when etcd installed by kubeadm (#9210, @LuckySB)
  • Fix failing tasks when calico_datastore is set to etcd (#9234, @chadswen)
  • Set fallback value of kubelet ip6 (#8942, @chinnonae)
  • Swap calico download url, as the old primary url was deprecated and artefact no longer published (#8920, @sathieu)

v2.18.2

1 year ago

Feature

  • Add missing configuration for extra tolerations (#9000, @smasset)

Bug or Regression

  • Disable auth webhook by default (#9240, @cristicalin)
  • Fix cert-manager unusable due to leader election namespace problem (#8681, @rtsp)
  • Removed quotation at nerdctl_extra_flags (#8699, @oomichi)
  • Run 0100-dhclient-hooks only if dhcpclient is enabled (#8658, @oomichi)
  • Fix image_command_tool var ignored since PR #8601 (#8684, @sathieu)

v2.19.0

1 year ago

Announcements

We are looking for maintainers; reach out in #5432.

Deprecation / Removal

  • [metrics server] Remove addon-resizer from image list (no longer in use) (#8566, @cyril-corbon)
  • Add kubeadm option to etcd_deployment_type to replace the etcd_kubeadm_enabled variable (#8317, @necatican) (See Notes 3)
  • Removes runc-arm64-1.0.3 hash value for non existing binaries (#8391, @Payback159)
  • Drop containerd 1.4 support (#8780, @oomichi)

Feature / Major changes

  • Add hashes for Kubernetes 1.24.0, 1.24.1, 1.21.12, v1.21.13, 1.22.8, 1.22.9, v1.22.10, 1.21.11, 1.23.5, 1.23.6, v1.23.7 and make kubernetes v1.23.7 default (#8628, #8746, #8783, #8876, #8760, @mzaian, @cristicalin)
  • Add youki runtime support to CRI-O (#8411, @electrocucaracha)
  • [etcd] add 0 hash for arm v3.5.2 to prevent deployment failures (#8651, @cristicalin)
  • [etcd] ensure etcd is properly upgraded when managed by kubeadm (#8722, @cristicalin)
  • [etcd] Add etcd_max_request_bytes option to set the request size limit of etcd (#8849, @necatican)
  • [etcd] add v3.5.1 for kubernetes 1.22+ (#8588, @mzaian)
  • [etcd] Added node label to etcd metrics (#8475, @fungusakafungus)
  • [Cilium] Update Cilium manifests and the default version to v1.11.3 (#8717, @necatican)
  • [Cilium] Add identity_allocation_mode support (#8430, @necatican)
  • [Cilium] Change Cilium setting identity_allocation_mode to cilium_identity_allocation_mode (#8519, @tomberget) (see Notes 1)
  • [cilium] Add the cilium ip-masq-agent configuration support (#8893, @mahjonp)
  • [docker] add support for cri-dockerd as a replacement for dockershim (#8623, @cristicalin)
  • Add dual-stack support to the kubelet --node-ip parameter; it takes effect when the ip6 option is set in host vars (#8542, @kakkotetsu)
  • Add ppc64le support (#8505, @mgiessing)
  • Add runc v1.1.0 hash values to support multi-arch installation. (arm64, amd64) (#8447, @Payback159)
  • Add support for EventRateLimit plugin configuration (#8711, @alegrey91)
  • Add support for including annotations on aws-ebs-csi-controller (#8779, @dlouks)
  • Add support for kube-vip (#8669, @sathieu)
  • Add support for service-account-lookup parameter (using kube_apiserver_service_account_lookup) (#8781, @alegrey91)
  • [ansible] add support for ansible 5 (ansible-core 2.12) (#8512, @cristicalin)
  • [ansible] make ansible 5.x the new default version (#8660, @cristicalin)
  • [ansible] update ansible and cryptography requirements (#8826, @cristicalin)
  • [cert-manager] Update cert-manager to 1.6.1 (#8377, @electrocucaracha)
  • [cert-manager] Update cert-manager to v1.7.2 (#8648, @rtsp)
  • [cert-manager] Upgrade to v1.8.0 (#8688, @rtsp)
  • Add Ubuntu 22.04 support (#8841, #8795, #8754, @u2216, @arno01, @oomichi)
  • Add evictionHard parameter to kubelet config (variables: eviction_hard/eviction_hard_control_plane) (#8421, @cyril-corbon)
  • Add hcloud as external cloud provider (#8440, @oujonny)
  • Add kube_router_cluster_asn option to set ASN number of the cluster (#8837, @rosskusler)
  • Add option to use UpCloud's preconfigured server plans, firewalls and managed load balancers (upgrade to 2.4.0 from 2.0.0) (#8758, @Ajarmar)
  • Add possibility to remove ippools from cni config (#8845, @tomcsi)
  • Add the ability to set tolerations (cert_manager_tolerations), nodeselector (cert_manager_nodeselector) and affinity (cert_manager_affinity) in cert-manager templates (#8389, @cyril-corbon)
  • Add the possibility to use the UpCloud csi-driver; add the possibility to use ansible_host as the API IP for the localhost kubeconfig (#8653, @robinAwallace)
  • Add Hardening setup guide (#8868, @alegrey91)
  • Add variables to manage kubelet parameters (kubelet_streaming_connection_idle_timeout / kubelet_make_iptables_util_chains) (#8796, @alegrey91)
  • Added the optional prompt or delay before uncordoning nodes after upgrades (see variable upgrade_node_post_upgrade_confirm). (#8530, @mac-chaffee)
  • Allow installation of a cluster using external CAs (kubernetes-ca, etcd-ca, kubernetes-front-proxy-ca) (#8620, @julienlefur)
  • Allow the customization of snapshot controller namespace using snapshot_controller_namespace (#8305, @liupeng0518)
  • Allow to change cert-manager leader election namespace for GKE Autopilot support (#8424, @rtsp)
  • Allow to choose image pull commands based on container manager or override them (#8380, @sathieu)
  • Allow to specify CA data for webhooks (using kube_webhook_token_auth_url_skip_tls_verify / kube_webhook_token_auth) (#8777, @dlouks)
  • Assert that IP range is enough for the nodes (#8720, @eakyildirim)
  • Bastion support now works for remove-node.yml (#8504, @roedie)
  • Bump upcloud csi-driver to v0.2.1 (#8784, @robinAwallace)
  • Change default kube_encryption_algorithm to "secretbox" (#8574, @Payback159) (See Notes 2)
  • Explicit container_manager variable for Etcd hosts (#8521, @vi7)
  • Improve first_kube_control_plane variable management to avoid installation failures due to variable overlapping (#8388, @unai-ttxu)
  • Improve offline script generate_list.sh using ansible (#8538, @tmurakam)
  • [ingress-nginx] upgrade to 1.2.1
  • Ingress controllers and external provisioners (respectively deployed via ingress_controller and external_provisioner roles meta dependencies) are now upgraded in upgrade-cluster.yml (#8640, @mirwan)
  • Local volume provisioner tolerations removed by default. (#8805, @spaced)
  • Replace CLB with NLB for kube-apiserver domain in Terraform AWS contrib code (#8578, @sophalHong)
  • Split kube_feature_gates variable for different kubernetes components (#8677, @alegrey91) (See Notes 4)
  • Helm-apps role for installing helm charts (#8347, @VannTen)
  • Upgrade azuredisk csi to v1.10.0 (#8432, @cyril-corbon)
  • Upgrade metrics-server to v0.5.2 and remove NET_BIND_SERVICE capabilities (#8338, @cyril-corbon) (See Notes 5)
  • Vagrant: new var $ansible_verbosity was introduced for setting the ansible verbosity level (#8639, @maciejaszek)
  • [CI] Move from CentOS 8 to AlmaLinux 8 for kubespray CI, therefore CentOS 8 is no longer tested (#8297, @cristicalin)
  • [CI] split molecule tests to run in parallel (#8756, @cristicalin)
  • [container image] use focal (ubuntu 20.04) base image for our docker builds (#8631, @cristicalin)
  • [coredns] Allow overriding the default CoreDNS zone's cache plugin configuration via the coredns_default_zone_cache_block variable (#8488, @Tristan971)
  • [csi-snapshotter] upgraded to 5.0.0 (#8403, @cristicalin)
  • [download] add capability to specify alternative download mirrors for files (#8474, @cristicalin)
  • [mitogen] update to 0.3.2 (#8470, @cristicalin)
  • [reset] remove containerd storage during reset (#8469, @cristicalin)
  • [sysctl] set fs.may_detach_mounts=1 to address pods stuck in Terminating state (#8635, @cristicalin)

Network

  • [Calico] upgrade calico to 3.19.4, 3.20.4 and 3.21.4 (default) and add 3.22.0 experimental support (#8544, @cristicalin)
  • [Calico] add 3.22.1 (#8612, @cristicalin)
  • [Calico] Add calico apiserver (using calico_apiserver_enabled variable) (#8690, @liupeng0518)
  • [Calico] Add support for IP6_AUTODETECTION_METHOD using new variable calico_ip6_auto_method (#8541, @kakkotetsu)
  • [Calico] upgrade default calico version to v3.22.3 (#8897, @germetist)
  • [Calico] Add configurable ipam strictaffinity (using calico_ipam_strictaffinity param) (#8581, @eyenx)
  • [Calico] Change the calico cni name from cni0 to k8s-pod-network by default (#8813, @cyclinder)
  • [Calico] Fix Wireguard support for CentOS Stream 9/RHEL 9 Beta (#8625, @ThisIsQasim)
  • [Calico] fix calico-kube-controllers verbs (#8847, @irizzant)
  • [calico] Some commands only need to be run once (#8833, @liupeng0518)
  • [calico] call calico checks early on to prevent altering the cluster with bad settings and causing traffic outages (#8707, @cristicalin)
  • [calico] make calico 3.21.x the new default and drop 3.18.x (#8426, @cristicalin)
  • [calico] switch default iptables backend detection to Auto (#8429, @cristicalin)
  • [calico] Use vxlan instead of ipip as the default calico encapsulation mode. This change impacts existing deployments that don't explicitly set the encapsulation mode and will need to set calico_ipip_mode: Always and calico_network_backend: bird to avoid the upgrade process breaking. (#8434, @cristicalin)
  • [calico] upgrade default calico version to v3.21.5 (#8745, @mzaian)
  • [calico] Use ipamconfig instead of calico ipam command (#8839, @liupeng0518)
  • [calico] don't clobber calico options set by the user (#8815, @cristicalin)
  • [flannel] Use install-cni-plugin to fit upstream (#8714, @zhengtianbao)
  • [kube-ovn] Sync some feature with upstream (#8790, @liupeng0518)
  • [kube-ovn] The network plug-in kube-ovn does not require a cluster to allocate podcidr (#8454, @chenhuazhong)
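
For the Calico encapsulation change above (#8434), existing deployments that relied on the old ipip default would pin it explicitly before upgrading. The two settings come directly from that entry; where you place them in your inventory is up to you.

```yaml
# Keep ipip encapsulation on an existing cluster instead of the new vxlan default.
calico_ipip_mode: Always
calico_network_backend: bird
```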

Applications

  • Instance customization via cloud init for openstack VMs deployed by terraform is now available. (#8394, @moss2k13) (See Notes 6)
  • [MetalLB] Configure PriorityClassName for deployment (#8362, @unai-ttxu)
  • [MetalLB] Improve validation conditions for BGP Peers (#8568, @kakkotetsu)
  • [MetalLB] Upgrade metallb to v0.11.0 and add liveness and readiness probe (#8420, @cyril-corbon)
  • [MetalLB] Allow to put node selectors and source address for each metallb peers (#8534, @hightoxicity)
  • [MetalLB] Added MetalLB BGP peer password authentication option. (#8792, @Oogy)
  • [MetalLB] Add images to downloads (#8715, @sathieu)
  • [MetalLB] Fix wrong port name in metallb.yml.j2 (metrics not monitoring) (#8510, @binkoni)
  • [OpenStack] Allow disabling port security in terraform contrib code (#8410, @cristicalin)
  • [OpenStack] Updated openstack cloud controller to version v1.22.0 (#8629, @Xartos)
  • [OpenStack] Create master nodes with for_each for openstack. Makes it easier to switch out master nodes via terraform. (#8709, @robinAwallace)
  • [OpenStack] Fixed cluster roles for openstack cloud controller (#8638, @Xartos)
  • [OpenStack] Fix templating of ansible_ssh_common_args in no_floating.yml if used as TF module (#8646, @frittentheke)
  • [OpenStack] allow disabling port security at port level (#8455, @cristicalin)
  • [vSphere] Terraform code will need var.vapp when a vapp is referenced (vsphere_hostname is also removed) (#8441, @ceesios)
  • [vsphere_csi] update to 2.5.1 and make external_vsphere_version 7.0u1 the default (#8676, @cristicalin)
  • Terraform/gcp: Allow to change extra disk types (#8524, @sathieu)
  • Terraform/gcp: Allow to use preemptible VM instances (using two new variable master_preemptible and worker_preemptible) (#8480, @sathieu)
  • Terraform/gcp: Do not create unused subnetworks; Terraform/gcp: Upgrade to latest google provider (#8497, @sathieu)
  • [Terraform AWS] Add tag to AWS VPC subnets for automatic subnet discovery (#8705, @sophalHong)
  • [terraform] use modern day equinix metal provider (#8748, @cristicalin)

Container-Managers

  • Check & uninstall container engine if needed (when changing container engine defined) (#8439, @cyril-corbon)
  • [Docker] Add epoch to docker-ce and docker-ce-cli packages to ensure docker upgrade (on rhel based) (#8618, @unai-ttxu)
  • [containerd] make containerd_insecure_registries into a dict similar to containerd_registries (#8340, @mircyb) (See Notes 7)
  • [containerd] upgrade versions to fix CVE-2022-23648 (#8597, @cristicalin)
  • [containerd] Upgrade containerd to 1.6.0 and re-enable arm architecture with default options; [runc] make 1.1.0 the default; [nerdctl] upgrade to 0.17.0 (#8555, @cristicalin)
  • [containerd] add hashes for 1.5.11 and 1.6.2 and make 1.6.2 the default (#8671, @cristicalin)
  • [containerd] Update containerd to 1.5.9 (#8402, @electrocucaracha)
  • [containerd] Fix containerd image download bug (#8894, @liupeng0518)
  • [containerd] nerdctl insecure registry support (#8339, @mircyb)
  • [containerd] Ensure containerd service unmasking (#8726, @rickerc)
  • [cri-o] Update configuration of registries in cri-o (#7852, @bsloeserwij) (See Notes 8)
  • [cri-o] add cri-o 1.23.x (#8599, @cristicalin)
  • [crun] update to 1.4 and drop pre-1.x versions (#8330, @cristicalin)
  • [crun] upgrade to 1.4.3 (#8598, @cristicalin)

Bug or Regression

  • Add ETCD_EXPERIMENTAL_INITIAL_CORRUPT_CHECK flag to etcd config (default to true) (#8664, @floryut)
  • Add with_networks variable to external_hcloud_cloud in ansible playbook and network_zone variable to Hetzner Cloud Terraform. (#8702, @Anthony-Bible)
  • Allow replacement of address prefixes for all images (#8764, @ErikJiang)
  • CRI-O: fix unqualified-search registries (#8496, @krystianmlynek)
  • Change libvirt default disk controller from IDE to SCSI (#8656, @190ikp)
  • Do not remove package in validate container engine role when FCOS (#8626, @LuckySB)
  • Enable Kubespray deployment on vagrant (#8697, @oomichi)
  • Enable several read-only tasks in check mode (#8584, @tjanson)
  • Ensure all Kubelet required kernel values are configured when enabling protectKernelDefaults (#8692, @unai-ttxu)
  • Ensure taint configuration for secondary control-plane acting both as control-plane and node (#8363, @unai-ttxu)
  • Fix "Error: error parsing jsonpath {, unclosed action" (#8683, @emiran-orange)
  • Fix DNS configuration when using resolvconf_mode='host_resolvconf' during scale (#8361, @unai-ttxu)
  • Fix GCP PVC creation on k8s v1.22 (#8616, @lmercl)
  • Fix 0090-etchosts file when setting override_system_hostname=false (#7634, @liupeng0518)
  • The kube-dns service is no longer deleted if it was not created by Kubespray (#8565, @cyril-corbon)
  • Fix an extra space in the kube-vip manifest. (#8831, @yankay)
  • Fix an issue where users cannot skip Red Hat registration by specifying -e rhel_enable_repos=False (#8871, @gleb108)
  • Fix an issue where offline script could not output URLs of both containerd and krew. (#8379, @oomichi)
  • Fix condition on kata_containers_version/kube_version check when kata_containers_enabled is false (#8804, @emiran-orange)
  • Fix container engine still installed on dedicated etcd node even if etcd_deployment_type: host (#8386, @rtsp)
  • Fix cri-o packages install for Rocky 8 (#8594, @brankomijuskovic)
  • Fix etcd certificates reference to support etcd_kubeadm_enabled: true (#7766, @forselli-stratio)
  • Fix imageRepository path for CoreDNS (ensure coredns repository namespace is kept) (#8572, @nicolas-goudry)
  • Fix incorrect condition type (#8822, @cyclinder)
  • Fix incorrect leader election namespace with cert-manager leading to insufficient permission (#8433, @rtsp)
  • Fix a problem installing HA etcd via kubeadm when PodSecurityPolicy is enabled: static pods are now mirrored earlier by kubelet. (#8744, @robinAwallace)
  • Fix kubectl call before installing it when setting first_kube_control_plane/joined_control_planes (#8412, @floryut)
  • Fix kubelet_kubelet_cgroups_cgroupfs pointing incorrectly to slice (#8500, @fungusakafungus)
  • Fix print_hostnames of inventory.py (#8554, @oomichi)
  • Fix remove-node.yaml playbook failing when a host is unreachable (#8843, @oomichi)
  • Fix failure when removing docker-ce.repo (#8856, @Thearas)
  • Fix the condition of drain on pre-remove task (#8634, @oomichi)
  • Fix typo and duplicated declaration of ingressclasses (#8591, @spaced)
  • Fix vagrant default value for parameters local_path_provisioner_enabled/multi_networking (#8650, @liupeng0518)
  • Fix wrong item in mitogen contrib (#8508, @kdszoom)
  • Fixed a bug where hosts with NetworkManager enabled were having their /etc/resolv.conf file edited directly instead of through NM. Fixed a bug where DNS lookup failures would cause reset.yml or scale.yml to error out when resolvconf_mode=host_resolvconf (#8575, @mac-chaffee)
  • Fixed a bug where updated versions of etcd weren't being applied. Check your etcd instances to make sure their versions are what you expect. If not, restarting all etcd members should apply the update. (#8556, @mac-chaffee)
  • Fixed a bug where upgrade-cluster.yaml would not apply updates to etcd-events (#8550, @mac-chaffee)
  • Fixes missing checksum for kata-containers 2.2.3 on arm architectures (#8383, @Payback159)
  • Fixes the etcd node removal by pointing ETCDCTL_ENDPOINTS to localhost (127.0.0.1) (#8526, @roedie)
  • Prevent removing etcd member when running in check mode (#8570, @fungusakafungus)
  • Removal flow: Waiting until Volumes will be detached from the node (#8739, @rocko-n)
  • Removed quotation at nerdctl_extra_flags (#8695, @T-Eberle)
  • Run 0100-dhclient-hooks only if dhcpclient is enabled (#8658, @oomichi)
  • Update verbs for volumeattachments resource (#8731, @moule3053)
  • Use correct service name for coredns when cleanup (#8811, @weizhoublue)
  • [Terraform-AWS] Fix error when creating subnets more than AZ (#8516, @sophalHong)
  • [cert-manager] Fix missing RBAC rules for ClusterRole cert-manager-cainjector (#8444, @onock)
  • [containerd] avoid cleanup of /usr/bin on ostree distributions (#8624, @cristicalin)
  • [reset] fix task inclusion logic for network plugin (#8727, @cristicalin)
  • [systemd-resolved] Fix DNS early and late stages (dns_early|dns_late) of cluster deployment; [systemd-resolved] Add upstream_dns_servers to FallbackDNS; [cluster-reset] Revert DNS configuration to early stage (for instance: only defined upstream nameservers) (#8561, @onock)
  • The incompatible ipset protocol version (7) included in kube-proxy since k8s 1.23 causes issues with the Fedora kernel. (#8397, @floryut) (See Notes 9)

Other (Cleanup or Flake)

  • Add IPv6 listen directive to nginx if enable_dual_stack_networks (#8596, @kakkotetsu)
  • Cleanup crictl configuration file during reset (#8569, @jayonlau)
  • Remove check_mode: no from gen_certs_script.yml to prevent changing files (#8573, @fungusakafungus)

Component versions:

  • Kubernetes v1.23.7
  • Etcd v3.5.3
  • Docker v20.10
  • Containerd v1.6.4
  • CRI-O v1.23
  • CNI-plugins v1.1.1
  • Calico v3.22.3
  • Cilium v1.11.3
  • Flannel v0.17.0
  • Kube-ovn v1.9.2
  • Kube-Router v1.4.0
  • Multus v3.8
  • Weave v2.8.1
  • Ceph-provisioner v2.1.0-k8s1.11
  • Cert-manager v1.8.0
  • CoreDNS v1.8.6
  • Nginx-ingress v1.2.1

Known issues

n/a

Notes

  1. If the Cilium setting identity_allocation_mode has been overridden locally, it needs to be changed to cilium_identity_allocation_mode.
  2. If you are already encrypting secrets at rest and have not set the kube_encryption_algorithm flag, then you must set kube_encryption_algorithm to aescbc, since the default value has changed to the more secure secretbox standard.
  3. etcd_kubeadm_enabled is deprecated. You can set etcd_deployment_type to kubeadm to get the same behaviour.
  4. This adds some variables that can be defined by the user. It doesn't introduce a breaking change because the previous single variable (kube_feature_gates) still works as expected. The new variables are: kube_apiserver_feature_gates, kube_controller_feature_gates, kube_scheduler_feature_gates, kube_proxy_feature_gates, kubelet_feature_gates
  5. Be careful as the port has now moved to 4443
  6. To use it, adjust contrib/terraform/openstack/modules/compute/templates/cloudinit.yaml before deployment. (Currently it uses one cloud init for all instances.)
  7. containerd_insecure_registries needs to be updated or won't work anymore!
  8. Update the configuration of cri-o registries to only use the crio_registries key
  9. Calico and K8S 1.23 might break under some OSes; please see additional details in the PR
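
Notes 1 to 3 above amount to a few inventory variable changes. A minimal sketch, assuming the overrides live in your inventory's group_vars (the file location and the Cilium value shown are illustrative assumptions, not taken from the release notes):

```yaml
# Hypothetical group_vars override file; adjust to wherever you keep cluster variables.

# Note 1: the variable was renamed; carry over whatever value you had set
# under identity_allocation_mode ("kvstore" here is only an example value).
cilium_identity_allocation_mode: kvstore

# Note 2: pin the previous algorithm if secrets are already encrypted at rest
# and kube_encryption_algorithm was never set explicitly.
kube_encryption_algorithm: "aescbc"

# Note 3: replaces the deprecated etcd_kubeadm_enabled.
etcd_deployment_type: kubeadm
```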

v2.18.1

2 years ago

Feature / Major changes

  • [kubernetes] Update kubernetes hashes and make 1.22.6 the default (#8467, @cristicalin)
  • Allow to choose image pull commands based on container manager or override them (#8380, @sathieu)
  • Improve offline script generate_list.sh using ansible (#8606, @tmurakam)
  • [CI] Move from CentOS 8 to AlmaLinux 8 for kubespray CI, therefore CentOS 8 is no longer tested (#8297, @cristicalin)
  • [container image] use focal (ubuntu 20.04) base image for our docker builds (#8631, @cristicalin)
  • [sysctl] set fs.may_detach_mounts=1 to address pods stuck in Terminating state (#8635, @cristicalin)

Container-Managers

  • [containerd] make containerd_insecure_registries into a dict similar to containerd_registries (#8340, @mircyb) (see Notes 1)
  • [containerd] nerdctl insecure registry support (#8339, @mircyb)

Bug or Regression

  • Fix an issue where offline script could not output URLs of both containerd and krew. (#8379, @oomichi)
  • Fix container engine still installed on dedicated etcd node even if etcd_deployment_type: host (#8404, @rtsp)

Notes

  1. containerd_insecure_registries needs to be updated or won't work anymore
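
As a rough illustration of the dict form mentioned in Note 1, the sketch below assumes the layout mirrors containerd_registries, with a registry name as key and its endpoint URL as value. The registry host is hypothetical; check the containerd documentation shipped with your Kubespray version for the exact format.

```yaml
# Assumed dict layout: "<registry host>": "<endpoint URL>"
# registry.example.internal is a placeholder, not a real registry.
containerd_insecure_registries:
  "registry.example.internal:5000": "http://registry.example.internal:5000"
```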

v2.18.0

2 years ago

Announcements

We are looking for maintainers, reach out in #5432.

Deprecation / Removal

  • [Ambassador] Remove code, ci and ansible tags as it's no longer maintained and not working anymore. (#8086, @floryut)
  • Drop support for Fedora 33 (#8246, @floryut)
  • Remove ovn4nfv support (#8265, @floryut)
  • Mitogen: support for the mitogen playbook accelerator is now deprecated in preparation of ansible upgrades, please clean up your playbooks that depend on it. (#8147, @cristicalin)
  • Remove registry-proxy of container registry (#8327, @zhengtianbao)

Feature / Major changes

  • Replace docker with containerd as the default container_manager (#8175, @cristicalin)
  • Add ArgoCD as a kubernetes-app, using the new argocd_enabled variable (#7895, @atorrescogollo)
  • Add ServiceTypes support to container registry (using new variables registry_service_type, registry_service_clusterIP, registry_service_loadBalancerIP, registry_service_annotations, registry_service_nodePort) (#8291, @zhengtianbao)
  • Add TLS and authentication support to container registry (using new variables registry_tls_secret, registry_htpasswd, registry_config) (#8229, @zhengtianbao)
  • Add a new option cert_manager_trusted_internal_ca to specify trusted internal ca of cert_manager. (#8135, @infra-monkey)
  • Add a new option metrics_server_resizer (default to false) to control the addon-resizer container deployment in metrics-server pod (#8018, @oomichi)
  • Add an optional fallback to node drain during cluster upgrades using --disable-eviction flag (#8094, @utkuozdemir)
  • Add capability to use node swap with kubernetes 1.22+ (using new variable kubelet_fail_swap_on, default to true) (#8241, @cristicalin)
  • Add the possibility of automated creation of Load Balancers on Google Compute Engine (#8179, @lmercl)
  • Add support for Fedora 35 (#8234, @floryut)
  • Add support for Rocky Linux (#8095, @ooraini)
  • Add support for cgroups v2 (no more reverting to cgroups v1 for Fedora) (#8237, @cristicalin)
  • Add the ability to skip some phases in the kubeadm join_phase using kubeadm_join_phases_skip (#8067, @necatican)
  • Added terraform support for Hetzner Cloud (#8053, @Xartos)
  • Allow to scrape etcd metrics using a service (#8203, @sathieu)
  • Default DNS replica count is now set to the minimum value between 2 and the length of k8s_cluster inventory group. (#8112, @smasset)
  • Determine root filesystem device and partition before running growpart (allowing it to not always be sda1) (#8024, @mlorenzo-stratio)
  • Ensure apparmor is installed on Ubuntu (#8036, @rtsp)
  • Fail metrics-server installation when addon-resizer is used on a platform different than amd64 (#8144, @zhengtianbao)
  • Krew: upgrade to v0.4.2 (#8168, @zhengtianbao)
  • Move deprecated kube_feature_gates from kubelet args to kubelet config (#8048, @fungusakafungus)
  • Multiple Ansible versions are now supported (2.9/2.10/2.11) and tested by CI (#8172, @cristicalin)
  • Prefer nodelocaldns as dns server over coredns when defined (#7731, @Alvaro-Campesino)
  • Python 2.7: revive python2.7 support on EL7, note that this is not properly exercised in CI. (#8192, @cristicalin)
  • Remove Terraform 0.14/0.15 support and CI -> Add TF 1.x (#8062, @floryut)
  • Support Python 3.10 - ruamel.yaml.clib need to be updated to 0.2.4 (#8034, @olivierlemasle)
  • Update Netchecker to v1.2.2 - now local etcd backend is needed to run (#8074, @cristicalin)
  • Update registry template with additional options (security context and probes) and variables (registry_storage_access_mode to change the access mode, registry_replica_count for replicas) (#8198, @zhengtianbao)
  • [nodelocaldns] add the capability to hot swap nodelocaldns without causing DNS blackholes during the swap (#8100, @cristicalin)
  • Add Ingress support to container registry (using new variables registry_ingress_annotations, registry_ingress_host, registry_ingress_tls_secret) (#8311, @zhengtianbao)
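
Several of the changes above are driven by inventory variables. The following is a minimal sketch of how a deployer might set them; the variable names and stated defaults come from the list above, while the chosen values are only examples.

```yaml
# Example overrides for v2.18.0 (values are illustrative).

# containerd is now the default; set this to docker to keep the previous behaviour.
container_manager: containerd

# Defaults to true; set to false to allow node swap on Kubernetes 1.22+.
kubelet_fail_swap_on: false

# Deploy ArgoCD as a kubernetes-app.
argocd_enabled: true

# Defaults to false; controls the addon-resizer container in the metrics-server pod.
metrics_server_resizer: false
```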

Applications

  • [cinder-csi] Add new variable cinder_csi_rescan_on_resize to control rescan-on-resize option (#8057, @reneluria)
  • [cinder-csi] Added variable cinder_tolerations that sets tolerations for cinder-csi-nodeplugin DaemonSet (no tolerations by default) (#8137, @Ajarmar)
  • [cinder-csi] Update version to support Kubernetes 1.22 and up (#8296, @StevenReitsma)
  • [Metallb] Allow changing metallb default pool name (var metallb_pool_name) (#8111, @damjanek)
  • [Metallb] Allow setting 'auto-assign' property to 'false' for default IP pool (var metallb_auto_assign) (#8193, @IKRozhkov)
  • [Openstack] Fix a bug where Openstack cloud provider could not be used with username/password (#8021, @bl0m1)
  • [Openstack] Replaces the global use_server_groups with the option to enable and set server group policy for each of the master, etcd, and node server groups respectively. (#8046, @OlleLarsson) (see Notes 2)
  • [Openstack] Adds the option to set boot volume type for k8s nodes (using node_volume_type variable) (#8256, @robinAwallace)
  • [Openstack] Use a pre-existing floating IP for bastion node, instead of creating a new one. (#8214, @feber)
  • [nginx-ingress] Nginx controller now also watches kind:ingress without class (#8128, @LuckySB)
  • [vSphere-CSI] Update to 2.4.0 (#8295, @cristicalin)
  • [vSphere] Terraform code now documents and requires specification of the OVF template to use and separate specification of the netmask to use. (#8178, @llarsson)

Network

  • [Calico] Add support for BGPPeer sourceAddress (#8306, @kakkotetsu)
  • [Calico] Reduced calico bird route removal time on large clusters to less than one minute improving Kubernetes node removal performance (#8227, @khatrig)
  • [Calico] Bump 3.21.x to 3.21.2 (#8275, @cristicalin)
  • [Calico] Add support for container ip forwarding setting, using new variable calico_allow_ip_forwarding (#8184, @zhengtianbao)
  • [Calico] Add vxlanEnabled spec in FelixConfiguration to prevent calico network (when using vxlan) from crashing after upgrading the cluster (#8167, @devinjeon)
  • [Calico] Check if 'plugins' key exists in calico_cni_config object allowing user to add nodes using both playbooks (#7717, @dlouks)
  • [Calico] Fix Kube-bench security warnings on calico controller (file ownership/permissions) (#8072, @oomichi)
  • [Calico] Fix typha prometheus causing a deployment error (#8005, @ericlake)
  • [Calico] Increase CPU limit to prevent throttling (#8076, @olevitt)
  • [Calico] Increase node probe timeouts and add calico_node_readinessprobe_timeout/calico_node_livenessprobe_timeout to tune them (#7981, @cristicalin)
  • [Calico] Make calico_min_version check relevant (#7939, @cristicalin)
  • [Calico] Make calico 3.20.x the default release and drop support for calico 3.17.x (#7984, @cristicalin)
  • [Calico] When default pool already exists and calico_pool_blocksize is defined in inventory, the assertion on blocksize equality wrongly fails because a string cast is missing (#8321, @emiran-orange)
  • [Cilium] During upgrades, wait for cilium pod to be ready before uncordoning node, add new option upgrade_post_cilium_wait_timeout to control that (By default 120 seconds) (#7978, @reneluria)
  • [Cilium] Fix operator metrics activation (enable-metrics key missing) (#8000, @L3o-pold)
  • [Weave] Allow EXTRA_ARGS to be configured for weave-npc, using weave_npc_extra_args (#8140, @brainfair)
  • [Weave] Update template to match upstream (#8013, @frankfil)
  • [ovn4nfv] Move crd API to v1, update crd spec (#8006, @floryut)

Container-Managers

  • Container engine is no longer installed on separate etcd nodes when using etcd_deployment_type: host (#7532, @VannTen)
  • [Docker] When using container_manager==docker (default config) you will now need to use docker_containerd_version to change the containerd version instead of the established containerd_version (#8130, @cristicalin)
  • [Kata-Containers] Update versions 2.2.0 (new default) and 2.1.1 (bugfix replacing 2.1.0). (#8017, @cristicalin)
  • [Kata-Containers] add support for version 2.3.0 (needs kubernetes 1.22.0+) (#8276, @cristicalin)
  • [containerd] Add the hashes for containerd version 1.4.12 and 1.5.8 and makes 1.5.8 the new default. (#8239, @cristicalin)
  • [containerd] upgrade versions 1.4.11 and 1.5.7 and make 1.4.11 the default (#8129, @cristicalin)
  • [containerd] Add support for SuSE distributions (#8261, @cristicalin)
  • [containerd] Download containerd from upstream instead of using distro specific packages (#7970, @cristicalin)
  • [containerd] Allow 'stable' and 'edge' ContainerD values on validation (#8020, @electrocucaracha)
  • [containerd] Ensure pulling, exporting and importing images for the target platform when dealing with multi-platform images to avoid partial import issues (#8245, @cristicalin)
  • [containerd] Fix the usage of cgroupfs with containerd and introduce cgroupfs-specific variables (⚠️ containerd_runtimes is now containerd_additional_runtimes) (#8123, @pasqualet)
  • [containerd] Moved containerd and runc from /usr/bin to bin_dir (defaults to /usr/local/bin) - Fixing install for FCOS (#8107, @mafn)
  • [containerd] Switch default resolvconf_mode to host_resolvconf (#8247, @cristicalin)
  • [containerd] Insecure registry support (#8298, @Morion-Self)
  • [cri-o] Add support for cri-o user namespaces (#8268, @nmasse-itix)
  • [cri-o] Enable experimental modules when rpm-ostree version >= 2021.9 (#8202, @zhengtianbao)
  • [gVisor] Update gVisor to 20210921 release (#8015, @cristicalin)
  • [runc] upgrade to v1.0.3 and add arm64 (#8274, @cristicalin)

Bug or Regression

  • Add gather facts to remove-node playbook to prevent issue with os evaluation (#8231, @IKRozhkov)
  • Add missing 'stable' and 'edge' keys in docker_cli_versioned_pkg dict (#8019, @electrocucaracha)
  • Add missing proxy settings for subscription-manager in RHEL OS (if http_proxy is defined) (#8012, @oomichi)
  • Change dns upstream condition for coredns (use upstream dns even when resolvconf_mode is set to docker_dns) (#8263, @toplordsaito)
  • Change etcd-events listen port (2381 -> 2383) to avoid conflicts (#8232, @zhengtianbao)
  • DeprecationWarning occurs when indentfirst=None is specified in coredns-config.yml.j2 (#8224, @Ishizuka427)
  • Fix CentOS7 issue with allowPrivilegeEscalation value from metrics-server (#8014, @oomichi)
  • Fix Heketi deployment logic that was broken by the ansible 3.4 upgrade (#8118, @cristicalin)
  • ~Fix apiserver_loadbalancer_domain_name pointing to external LB instead of dbip (#8299, @singeleaf)~ [REVERTED]
  • Fix a conflict with containerd and podman under CentOS 8.x (remove podman when installing Docker/Containerd) (#8016, @panpan0000)
  • Fix bad indentation in cert-manager when trusted internal ca is defined (#8314, @infra-monkey)
  • Fix calico's inventory check (Check if inventory match current cluster configuration) conversion (#8120, @juliohm1978)
  • Fix cert_manager ClusterIssuer manifest by removing deprecated ClusterIssuer (#8064, @rtsp)
  • Fix cloud_provider check in preinstall task, allowing oci value (and removing deprecated ones) (#8164, @oomichi)
  • Fix containerd failed to start if apparmor is not installed (#8011, @rtsp)
  • Fix debian 9 check for apt cache update in bootstrap-os (#8215, @floryut)
  • Fix deploying loadbalancer to masters when bind-address is not set to 0.0.0.0 (and loadbalancer_apiserver_localhost is true) (#8262, @Bledai)
  • Fix forgotten update of etcd-servers list in apiserver manifest when scaling (#8253, @liupeng0518)
  • Fix k8s-certs-renew cp path wrongly using /usr/bin/ (#7992, @lazybetrayer)
  • Fix k8scsi/csi-resizer repo (from gcr to quay) (#8270, @oomichi)
  • Fix kata-containers runtime with version 2.x (#8068, @cristicalin)
  • Fix kubespray flatcar ansible_os_family and ansible_distribution for backward compatibility (#8029, @isantospardo)
  • Fix quorum check when recovering broken etcd cluster (with etcd 3.5.x) (#8126, @floryut)
  • Fix reset playbook for Fedora OS (#8205, @cristicalin)
  • Fix wrong baseurl for centos extra repo for Oracle Linux (missing /os/) (#8208, @buker)
  • Fixes incongruence between metrics-server resources limits/requests defined in official templates (#8088, @irizzant)
  • [Calico] Fix support for version 3.21.x (#8250, @cristicalin)
  • [Calico] add missing verbs in ClusterRole (#8136, @krystianmlynek)
  • Fix resolved config when nodelocaldns is not enabled (#8351, @liupeng0518)

Other noteworthy changes

  • Add auto completion for krew addon (#8171, @zhengtianbao)
  • Added Ubuntu 21.04 (hirsute) in restart network task (reset role) (#8134, @seungjinyu)
  • Limit kubectl delete node to k8s nodes and not etcd (#8101, @VannTen)
  • NetworkManager tasks can now be run with ansible check_mode (#8133, @Isakgicu)
  • Remove comparison of kubelet_shutdown_grace_period and kubelet_shutdown_grace_period_critical_pods (#7993, @cristicalin) (see Notes 1)
  • Replace deprecated --delete-local-data in pre-remove/pre-upgrade tasks (#8081, @mzaian)
  • Replace path_join (in reset role) to support Ansible 2.9 (#8160, @zhengtianbao)
  • Update local-volume-provisioner image from quay to k8s.gcr (#8054, @foxdalas)
  • Use kube_config_dir for kubeconfig instead of hard path in multiple plays (#7996, @oomichi)
  • Add Glusterfs daemonset readiness and liveness params (and increase initial_delay_seconds to 10 seconds) (#8309, @zemkogabor)
  • Simplify usage of pre-remove role (#8334, @VannTen)

Component versions:

  • Kubernetes v1.22.5
  • Etcd 3.5.0
  • Docker 20.10
  • Containerd 1.5.8
  • CRI-O 1.22
  • CNI-plugins v1.0.1
  • Calico v3.20.3
  • Cilium 1.9.11
  • Flannel 0.15.1
  • Kube-ovn 1.8.1
  • Kube-Router 1.3.2
  • Multus 3.8
  • Weave 2.8.1
  • CoreDNS 1.8.0
  • Nodelocaldns 1.21.1
  • Helm 3.7.1
  • Nginx-ingress 1.0.4
  • Cert-manager 1.5.4
  • Kubernetes Dashboard v2.4.0

Known issues

n/a

Notes

  1. This PR removes the comparison of kubelet_shutdown_grace_period to kubelet_shutdown_grace_period_critical_pods because ansible cannot do time interval comparisons sanely so we defer to the better judgement of the deployer.
  2. The terraform variable use_server_groups has been removed; please use master_server_group_policy, node_server_group_policy and etcd_server_group_policy instead.

v2.17.1

2 years ago

Major changes

  • Update kubernetes version to 1.21.6 (#8142, @oomichi)
  • Add a new option metrics_server_resizer (default to false) to control the addon-resizer container deployment in metrics-server pod (#8018, @oomichi)
  • Add an optional fallback to node drain during cluster upgrades using --disable-eviction flag (#8102, @utkuozdemir)
  • Ensure apparmor is installed on Ubuntu (#8036, @rtsp)
  • Default DNS replica count is now set to the minimum value between 2 and the length of k8s_cluster inventory group (#8109, @smasset)

Applications

  • [Openstack] Fix a bug where Openstack cloud provider could not be used with username/password (#8021, @bl0m1)

Network

  • [Calico] Check if 'plugins' key exists in calico_cni_config object allowing user to add nodes using both playbooks (#7717, @dlouks)
  • [Calico] Fix typha prometheus causing a deployment error (#8005, @ericlake)
  • [Calico] Increase node probe timeouts and add calico_node_readinessprobe_timeout/calico_node_livenessprobe_timeout to tune them (#7981, @cristicalin)
  • [Cilium] Fix operator metrics activation (enable-metrics key missing) (#8000, @L3o-pold)

Bug or Regression

  • Add missing proxy settings for subscription-manager in RHEL OS (if http_proxy is defined) (#8012, @oomichi)
  • Fix CentOS7 issue with allowPrivilegeEscalation value from metrics-server (#8014, @oomichi)
  • Fix k8s-certs-renew cp path wrongly using /usr/bin/ (#7992, @lazybetrayer)
  • Fix containerd failed to start if apparmor is not installed (#8011, @rtsp)

Other noteworthy changes

  • Use kube_config_dir for kubeconfig instead of hard path in multiple plays (#7996, @oomichi)

v2.17.0

2 years ago

Announcements

We are looking for maintainers, reach out in #5432.

Deprecation / Removal

  • Drop support for Fedora 32 (#7657)

Major changes

  • Add support for Fedora 34 (#7657)
  • Add Debian 11 (bullseye) support (#7853)
  • Enable Graceful Node Shutdown for Kubernetes >= 1.21.0 (#7746)
  • Move to Ansible 3.x by default (#7672) (see Notes 1)
  • Set selinux type t_etc if selinux state is enforcing (#7791)
  • Add Infomaniak to compatible public clouds list (#7910)
  • During pre-upgrade add a flag to always cordon (#7892) (see Notes 2)
  • Update Terraform 0.15 to tf validated and tested versions (#7927)
  • Feature DynamicKubeletConfig is deprecated in 1.22 and will not move to GA (#7938) (see Notes 5)
  • Inventory builder can now add IP to inventory (#7583) (see Notes 6)
  • Add a new option kubeadm_upgrade_auto_cert_renewal to control certificates renewal during control plane upgrade (#7976)

Applications

  • [Openstack] Openstack cloud config: store cloud.conf and API CA cert in k8s secret and avoid writing them to disk (#7603)
  • [vSphere] vSphere credentials can now be passed as environment variables (#7646)
  • [vSphere] Update vSphere CPI ClusterRole according to the latest official CPI manifests (#7838)
  • [vSphere] Add support for vSphere CSI driver 2.2.X versions (#7848)
  • [Cinder] Add cinder_csi_ignore_volume_az (#7624)
  • [Cinder] Added support for application credentials for cinder-csi (#7799)
  • [Cinder] Added support for sourcing application credentials from environment variables (#7799)
  • [MetalLB] Update to v0.10.2 (#7925)
  • [MetalLB] Update default variable: keep nodeSelector in one place (#7931)
  • [CSI] Update CSI snapshotter and allow enabling it stand-alone (#7943)
  • [nginx-ingress] Bump to 1.0.0 to support kube 1.22 (#7942) (see Notes 3) (see Notes 4)
  • [UpCloud] Updated terraform script to use private network and dynamic additional disks (#7779)

Container managers

  • [Kata-container] Replace deprecated 1.x version of Kata containers with the new 2.x (#7670)
  • [gVisor] Add initial support for gVisor container runtime (#7661)
  • [CRI-O] Allow cri-o offline install (#7777)
  • [CRI-O] Add cri-o to support secure/insecure registry authentication (#7837)
  • [Containerd] Enable containerd on Fedora CoreOS (#7794)
  • [Containerd] Add containerd on Flatcar Container Linux (#7681)
  • [Containerd] Add containerd secure/insecure registry authentication support (#7868)

Network

  • [Calico] Add support for Calico 3.19.1 (#7630)
  • [Calico] Add retries to 'Set label for route reflector' task (#7645)
  • [Calico] Support enabling the eBPF dataplane for Calico (#7618)
  • [Calico] Add Wireguard support (#7638)
  • [Calico] Use --allow-version-mismatch in calicoctl.sh to allow upgrades (#7873)
  • [Calico] kube_service_addresses_ipv6 is now added to serviceClusterIPs if enable_dual_stack_networks is true (#7944)
  • [Cilium] Add cilium_operator_api_serve_addr to cilium operator config (#7901)

Other noteworthy changes

  • Add nodeSelector for other services and node labels before CNI setup (#7613)
  • Allow deployers to limit the interface on which nodelocaldns exposes its prometheus listening port (#7748)
  • Ubuntu changed package name python-apt to python3-apt (#7769)
  • Retry to fetch binary if it fails first time (#7839)
  • Remove environment variable in remove-node play (#7729)
  • addons/cert_manager: Retry until webhook pods have been created (#7850)
  • Add tags: always to all included service playbook (#7906)
  • Use --no-cache-dir flag to pip in dockerfiles to save space (#7898)

Component versions:

  • Kubernetes v1.21.5
  • Etcd 3.4.13
  • Docker 20.10
  • Containerd 1.4.9
  • CRI-O 1.21
  • CNI-plugins v0.9.1
  • Calico v3.19.2
  • Cilium 1.9.10
  • Flannel 0.14.0
  • Kube-ovn 1.7.2
  • Kube-Router 1.3.0
  • Multus 3.7.2
  • ovn4nfv v1.1.0
  • Weave 2.8.1
  • CoreDNS 1.8.0
  • Nodelocaldns 1.17.1
  • Helm 3.6.3
  • ambassador: v1.5
  • Nginx-ingress 1.0.0
  • Cert-manager 1.0.4
  • Kubernetes Dashboard v2.3.1

Known issues

  • Ubuntu-16 won't work with default containerd version (1.4.9) as packages are not available, please use 1.4.6

Notes

  1. Users need to uninstall ansible 2.9 before installing ansible 3.x, which was split between ansible-base and ansible-collections.
  2. Setting roles/upgrade/pre-upgrade/defaults/main.yml:upgrade_node_always_cordon to true causes a node to be drained before an upgrade and uncordoned after an upgrade even if the node is not cordoned when the upgrade begins.
  3. Ingress-nginx: upgrade to 1.0.0 with stable ingress API, this version requires explicitly setting kubernetes.io/ingress.class: nginx on managed ingresses
  4. ⚠️ nginx-ingress 1.0 does not support networking.k8s.io/v1beta
  5. Flag --dynamic-config-dir has been deprecated, Feature DynamicKubeletConfig is deprecated in 1.22 and will not move to GA. It is planned to be removed from Kubernetes in the version 1.23. Please use alternative ways to update kubelet configuration.
  6. The dynamic inventory builder will by default overwrite the inventory config. This was previously unintended behavior. In order to add new hosts into the already existing inventory config use the add command e.g. $ inventory.py add 10.0.1.8
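
To illustrate Notes 3 and 4, here is a sketch of an Ingress written against the stable networking.k8s.io/v1 API with the ingress class annotation set explicitly. The resource name, host, and backend service are hypothetical placeholders.

```yaml
# Hypothetical Ingress managed by ingress-nginx 1.0.0.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app            # placeholder name
  annotations:
    # Required explicitly as of ingress-nginx 1.0.0 (Note 3).
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: app.example.com    # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app   # placeholder Service
                port:
                  number: 80
```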

v2.16.0

2 years ago

Announcements

We are looking for maintainers, reach out in #5432.

Deprecation / Removal

  • Remove contrib/vault (Outdated since 2018) (#7400)
  • Drop support for calico version 3.15.x (#7545)

Major changes

  • Replace inventory group kube-master with kube_control_plane (#7256) (see Notes 5)
  • Move kubernetes/master to kubernetes/control-plane (#7218) (see Notes 1)
  • Move recover_control_plane/master to control-plane (#7236) (see Notes 2)
  • Replace KUBE_MASTERS with KUBE_CONTROL_HOSTS (#7257) (see Notes 3)
  • Rename ansible groups to use _ instead of - (#7552) (see Notes 7)
  • Add AlmaLinux support (#7538)
  • Add terraform support for Exoscale (#7141)
  • Add terraform support for Vsphere (#7306)
  • Add terraform support for UpCloud (#7360)
  • Support for CentOS 8 and derivatives is considered stable (#7615)
  • Support dual stack IPv4 & IPv6 networking (#6859)
  • Auto renew control plane certificates (#7358) (see Notes 4)
  • Add auto_renew_certificates_systemd_calendar to configure when K8S certificates renewal runs (#7490)
  • Specify runAsGroup, allow safe sysctls by default (#7399)
  • Add KubeSchedulerConfiguration for k8s 1.19 and up (#7351) (see Notes 6)
  • Add script to generate the list of download files and images (#7561)
  • Terraform 0.12+ is now required to run scripts under contrib/terraform/aws (#7576)
  • Allow using ansible 2.10.x to deploy Kubespray (#7600)
  • Add a contrib playbook (os-manage) to disable service firewall for Kubespray development and test (#7431)

Applications

  • [Krew] Add krew support (#7464)
  • [Openstack] Make sure worker rules are applied on workers (#7279)
  • [Openstack] Write openstack controller manifests with correct perms (#7284)
  • [Openstack] Allow users to set image_uuid instead of name, this allows the use of openstack community images (#7283)
  • [Openstack] Use image id instead of name (#7293)
  • [Openstack] Update Cinder CSI driver to v1.20.0 (#7280)
  • [Openstack] Add most_recent = true while retrieving the latest image (#7376)
  • [Openstack] Add external_openstack_enable_ingress_hostname option for external-openstack-cloud-controller-manager (#7572)
  • [Metallb] Introduces optional tolerations and nodeSelector for metallb components (controller and speaker) (#7334)
  • [CSI] Add support for vSphere CSI driver 2.X versions (#7480)
  • [External-Provisioner] Add new variable "local_volume_provisioner_use_node_name_only" to configure local volume provisioner "useNodeNameOnly" option (#7421)

Container managers

  • [CRI-O] Add experimental cri-o support for Amazon Linux 2 (#7353)
  • [CRI-O] Add support for configuring cri-o pids_limit (#7525)
  • [CRI-O] Fix support for cri-o on OracleLinux and add support for AlmaLinux (#7541)
  • [Containerd] Fix reset.yml failing when using containerd (#7308)
  • [Containerd] Add privileged_without_host_devices support (#7343)
  • [Containerd] Update config.toml to V2 and set default runtime to io.containerd.runc.v2 and cgroup to systemd (#7398)
  • [Containerd] Add containerd_extra_args (#7461)
  • [Containerd] Add nerdctl cli tool for containerd users (#7500)
  • [Containerd] Add support for Amazon Linux 2 (#7595)
  • [Docker] docker_dns_servers_strict had different default values, the default is now the same everywhere: false (#7499)
  • [Docker] Add enablerepo: amzn2extra-docker to allow docker installation on Amazon linux (#7507)
  • [crun] Update and changed the default crun version to v0.19 (#7433)
  • [crictl] Change the owner of /etc/crictl.yaml to root (#7254)

Network

  • [Calico] Fixup check when ipipMode / vxlanMode is not present (#7195)
  • [Calico] Support for dual stack (IPv4 & IPv6) network deployment using Calico is introduced as an opt-in feature (#6859)
  • [Calico] Add option to use calico with azure when using calico in vxlan (#7300)
  • [Calico] Download Calico KDD CRDs (#7372)
  • [Calico] Add the ability to customize calico's bird port, via calico_bird_listen_port variable (#7419)
  • [Calico] Add new variable calico_node_startup_loglevel to configure CALICO_STARTUP_LOGLEVEL (Default to error) (#7530)
  • [Calico] Allow specifying overriding BGP peer name (#7591)
  • [Calico] Enables Calico serviceAccount token monitoring and update of /etc/cni/net.d/calico-kubeconfig if need be (#7586)
  • [Calico] Add support to advertise MetalLB allocated IPs through Calico when using Calico 3.18 and greater (#7593)
  • [Cilium] Allow cilium to be deployed with transparent encryption (#7342)
  • [Cilium] Add cilium_ipam_mode variable (#7418)
  • [Cilium] Move cilium kvstore settings to configmap (#7462)
  • [Cilium] Update Cilium documentation and overall update of cilium role (#7521)
  • [Ambassador] Add ingress_ambassador_multi_namespace setting, allows Ambassador operator to watch all namespaces for AmbassadorInstallation CRD resources (#7516)
  • [Flannel] Add image_arch in image tag (#7560)

Other noteworthy changes

  • Added the ping_access_ip variable to enable(default)/disable ping test during preinstall (#7020)
  • Rework proxy support (#7095)
  • Remove ignore_errors from drain tasks and enable retries (#7151)
  • Add other masters sequentially, not in parallel (#7166)
  • Add 2 variables for upgrade, to prompt (upgrade_node_confirm, default false) and delay (upgrade_node_pause_seconds, default 0 seconds) (#7168)
  • Change node-role.kubernetes.io from master to control-plane (#7183)
  • Add retries to drain during upgrade. Allow leaving nodes cordoned after drain failure. Allow continuing upgrade if drain fails (#7227)
  • Vagrantfile: always recreate inventory symlink (#7245)
  • Updated etcd cert check tasks to detect when new cert gen is required (#7219)
  • Only use stat get_checksum: yes when needed (#7270)
  • Match on os-release ID / VARIANT_ID (#7269)
  • Fix issue with kubeadm when *_PROXY variables are present in the environment (#7275)
  • Kubespray now ignores *_PROXY vars found in your environment and only uses proxy configuration from the inventory (#7309)
  • Facts.yaml: reduce the number of setup calls by ~7x (#7286)
  • Fixup kubelet.conf to point to kubelet-client-current.pem (#7347)
  • Check for dummy kernel module (#7348)
  • Disable gather_facts to work correctly via bastion (#7265)
  • Add etcd max snapshot and wals (#7382)
  • Add cryptography module installation (#7404)
  • Allow connecting to bastion via non-standard SSH port (#7396)
  • Remove local lb privileged securityContext (#7437)
  • Regenerate apiserver.crt on all control-plane nodes when needed instead of just the first one (#7463)
  • Check if python netaddr is installed and if Jinja is recent enough (#7486)
  • Add ingress controller ingress-class var (#7522)
  • Update Dockerfile to reduce Kubespray image size (#7556)
  • Change kubeadm coredns addon images name to coredns/coredns (#7570)
  • Allow usage of jinja2_native=True (#7612 / #7606)
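
The upgrade and preinstall switches listed above can be set like any other inventory variable. A small sketch, using the variable names and defaults stated in the list (the non-default values are examples only):

```yaml
# Ping test during preinstall; enabled by default.
ping_access_ip: true

# Prompt for confirmation before each node is upgraded (default: false).
upgrade_node_confirm: true

# Pause between node upgrades, in seconds (default: 0); 30 is an example value.
upgrade_node_pause_seconds: 30
```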

Component versions:

  • Kubernetes v1.20.7
  • Etcd 3.4.13
  • Docker 19.03
  • Containerd 1.4.4
  • CRI-O 1.20
  • CNI-plugins v0.9.1
  • Calico v3.17.4
  • Cilium 1.8.9
  • Flannel 0.13.0
  • Kube-Router 1.2.2
  • Multus 3.7
  • Kube-ovn 1.6.2
  • Weave 2.8.1
  • CoreDNS 1.7.0
  • Nodelocaldns 1.17.1
  • Helm 3.5.4
  • Nginx-ingress 0.43.0
  • Cert-manager 1.0.4
  • Kubernetes Dashboard v2.2.0

Known issues

  • Ansible 2.11 is not supported and using it will result in errors
  • Using Docker container engine could prompt "PLEG IS NOT HEALTHY" error, due to a runc bug, please see this issue for more information.

Notes

  1. The role kubernetes/master has been renamed to kubernetes/control-plane, if using the role kubernetes/master solely on previous Kubespray, it is necessary to update the specified role.
  2. The role recover_control_plane/master has been renamed to recover_control_plane/control-plane. If using the role recover_control_plane/master solely on previous Kubespray, it is necessary to update the specified role.
  3. inventory_builder now reads the environment variable KUBE_CONTROL_HOSTS to get the number of control-plane nodes. It still reads KUBE_MASTERS, but that variable will no longer be read after some deprecation cycles. If you currently set KUBE_MASTERS, please set KUBE_CONTROL_HOSTS instead.
  4. You can enable automatic renewal of control plane certificates using auto_renew_certificates, or renew them manually with k8s-certs-renew.sh. force_certificate_regeneration is removed, as it only renewed the API server certificates and not the others.
  5. The inventory group kube-master has been renamed to kube_control_plane. If you continue to use an existing inventory file, replace kube-master with kube_control_plane in it.
  6. New vars for configuring kube-scheduler were introduced (including extenders and profiles). Default values can be found at roles/kubernetes/control-plane/defaults/main/kube-scheduler.yml
  7. Ansible groups were updated to be more consistent with dynamic inventory plugins: k8s-cluster -> k8s_cluster / kube-node -> kube_node / calico-rr -> calico_rr / no-floating -> no_floating
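
The group renames in Notes 5 and 7 can be applied to an existing INI-style inventory mechanically. The following is a minimal sketch, not part of Kubespray: the helper name rename_groups and the assumption that groups appear only as section headers or as entries under a [...:children] section are illustrative. Review the result by hand before using it.

```python
import re

# Old group name -> new group name, taken from Notes 5 and 7 above.
GROUP_RENAMES = {
    "kube-master": "kube_control_plane",
    "k8s-cluster": "k8s_cluster",
    "kube-node": "kube_node",
    "calico-rr": "calico_rr",
    "no-floating": "no_floating",
}

# Matches an INI section header such as [kube-master] or [k8s-cluster:children].
_HEADER = re.compile(r"^\[(?P<name>[^\]:]+)(?P<suffix>:[^\]]+)?\]\s*$")


def rename_groups(inventory_text: str) -> str:
    """Return inventory_text with old group names replaced by the new ones.

    Hypothetical helper (an assumption for illustration): it rewrites section
    headers and bare group names listed under a [...:children] section.
    Host lines and variables are left untouched.
    """
    out = []
    in_children = False
    for line in inventory_text.splitlines():
        match = _HEADER.match(line.strip())
        if match:
            name = GROUP_RENAMES.get(match["name"], match["name"])
            suffix = match["suffix"] or ""
            in_children = suffix == ":children"
            out.append(f"[{name}{suffix}]")
        elif in_children and line.strip() in GROUP_RENAMES:
            out.append(GROUP_RENAMES[line.strip()])
        else:
            out.append(line)
    result = "\n".join(out)
    # Keep the trailing newline if the input had one.
    if inventory_text.endswith("\n"):
        result += "\n"
    return result
```

A typical use would be to read an inventory file (for example a hypothetical inventory.ini), pass its contents to rename_groups, and compare the output against the original before writing it back. Group names referenced elsewhere, such as in group_vars directory names or custom playbooks, are not covered by this sketch.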

v2.15.1

3 years ago

This release includes the following changes (among other things):

  • Set Kubernetes default version to v1.19.9
  • Remove local lb privileged (#7454)
  • Check kube-apiserver up on all masters before upgrade (#7217)
  • Check for dummy kernel module (#7348)
  • containerd,docker: stop installing extras repo on CentOS/RHEL
  • Calico: fixup check when ipipMode / vxlanMode is not present
  • Update azure cloud config (#7221)
  • roles/docker: Make repokey fingerprint overrideable (#7263)
  • Adding other masters sequentially, not in parallel (#7166)
  • calico: fix NetworkManager check (#7169)
  • Remove ignore_errors from drain tasks and enable retries (#7151)
  • Correct Jinja Syntax for etcd-unsupported-arch (#6919)
  • Fix unintended SIGPIPEs. (#7214)
  • Fix: Bastion undefined variable (#7227)
  • Ensure when use_oracle_public_repo is set to false the public Oracle
  • Fix ansible calico route reflector tasks in calico role (#7224)
  • Run containerd related tasks on OracleLinux. (#7250)
  • Remove deletion of coredns deployment. (#7211)
  • Fix network restart not working on Fedora CoreOS (#7271)
  • Only use stat get_checksum: yes when needed (#7270)
  • Fixup cri-o metacopy mount options (#7287)
  • Ensure kubeadm doesn't use proxy (#7275)
  • Ensure we gather IPv6 facts
  • Add privileged_without_host_devices support (#7343)
  • Auto renew control plane certificates (#7358)
  • Fix k8s-certs-renew for k8s < 1.20 (#7410)
  • Fixup kubelet.conf to point to kubelet-client-current.pem (#7347)
  • Fix "api is up" check (#7295)
  • Fix remove-node by removing jq usage (#7405)
  • Fix reset when using containerd (#7308)
  • Fix proxy usage when *_PROXY are present in environment (#7309)
  • Fix the duplicated filename /etc/vault in the reset role. (#7313)
  • Fix recover-control-plane undefined 'proxy_disable_env' variable (#7326)
  • Fix: added string to bool conversion for use_localhost_as_kubeapi_loadbalancer (#7324)