Certified Kubernetes Administrator (CKA) Exam Preparation
nodeSelector, the simplest form of node selection constraint, specifies a map of key-value pairs. For the pod to be eligible to run on a node, the node must have each of the indicated key-value pairs as labels (it can have additional labels as well). You can attach a label to a node with:
kubectl label nodes <node-name> <label-key>=<label-value>
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
Interlude: built-in node labels
kubernetes.io/hostname
failure-domain.beta.kubernetes.io/zone
failure-domain.beta.kubernetes.io/region
beta.kubernetes.io/instance-type
beta.kubernetes.io/os
beta.kubernetes.io/arch
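These built-in labels can be used in a nodeSelector exactly like user-defined ones. A minimal sketch that pins a pod to a single node via kubernetes.io/hostname (the node name worker-1 is a made-up example):

```yaml
# Illustrative only: pins a pod to one node using the built-in
# kubernetes.io/hostname label; "worker-1" is a hypothetical node name.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    kubernetes.io/hostname: worker-1
```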
Affinity/anti-affinity expands on nodeSelector. The key enhancements are:
- the language is more expressive (not just “AND of exact match”)
- you can indicate that the rule is “soft”/“preference” rather than a hard requirement, so if the scheduler can’t satisfy it, the pod will still be scheduled
- you can constrain against labels on other pods running on the node (or other topological domain), rather than against labels on the node itself, which allows rules about which pods can and cannot be co-located
Node affinity is conceptually similar to nodeSelector – it allows you to constrain which nodes your pod is eligible to schedule on, based on labels on the node. There are currently two types of node affinity:
requiredDuringSchedulingIgnoredDuringExecution
preferredDuringSchedulingIgnoredDuringExecution
The “IgnoredDuringExecution” part of the names means that, similar to how nodeSelector works, if labels on a node change at runtime such that the affinity rules on a pod are no longer met, the pod will still continue to run on the node.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
The new node affinity syntax supports the following operators: In, NotIn, Exists, DoesNotExist, Gt, Lt.
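For example, NotIn and DoesNotExist can express a form of node anti-affinity. A sketch, assuming hypothetical disktype and experimental labels:

```yaml
# Sketch with made-up labels: schedule only on nodes whose disktype
# label is NOT "hdd" and which do NOT carry an "experimental" label.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: NotIn
          values:
          - hdd
        - key: experimental
          operator: DoesNotExist
```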
If you specify both nodeSelector and nodeAffinity, both must be satisfied for the pod to be scheduled onto a candidate node.

If you specify multiple nodeSelectorTerms associated with nodeAffinity types, then the pod can be scheduled onto a node if one of the nodeSelectorTerms is satisfied.

If you specify multiple matchExpressions associated with nodeSelectorTerms, then the pod can be scheduled onto a node only if all matchExpressions can be satisfied.
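These combination rules can be sketched in one fragment (all label keys below are hypothetical): a node qualifies if it matches either term, and within a term every matchExpression must hold.

```yaml
# Hypothetical labels, for illustrating OR-across-terms, AND-within-term:
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:        # term 1: disktype=ssd AND zone=zone-a
      - key: disktype
        operator: In
        values:
        - ssd
      - key: zone
        operator: In
        values:
        - zone-a
    - matchExpressions:        # term 2: a gpu label exists (alone suffices)
      - key: gpu
        operator: Exists
```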
Inter-pod affinity and anti-affinity (beta feature)
Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled on, based on labels on pods that are already running on the node, rather than based on labels on the nodes themselves. As with node affinity, there are two types:
requiredDuringSchedulingIgnoredDuringExecution
preferredDuringSchedulingIgnoredDuringExecution

You express the rules using a topologyKey, which is the key for the node label that the system uses to denote the topology domain (for example, a zone or a hostname).
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: kubernetes.io/hostname
The legal operators for pod affinity and anti-affinity are In, NotIn, Exists, DoesNotExist.
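A common application of pod anti-affinity is spreading a workload’s replicas across nodes so no two land on the same host. A sketch, assuming a hypothetical app: web label on the pods:

```yaml
# Illustrative Deployment: each replica avoids nodes already running a
# pod labeled app=web, spreading the replicas across hostnames.
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web
            topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: nginx
```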
DaemonSets

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected.
Some typical uses of a DaemonSet are:
- running a cluster storage daemon, such as glusterd or ceph, on each node
- running a logs collection daemon on every node, such as fluentd or logstash
- running a node monitoring daemon on every node, such as Prometheus Node Exporter, collectd, the Datadog agent, the New Relic agent, or Ganglia gmond
apiVersion: apps/v1beta2 # for versions before 1.8.0 use extensions/v1beta1
kind: DaemonSet
A Pod Template in a DaemonSet must have a RestartPolicy equal to Always, or be unspecified, which defaults to Always.
As of Kubernetes 1.8, you must specify a pod selector that matches the labels of the .spec.template.
If you specify a .spec.template.spec.nodeSelector, then the DaemonSet controller will create Pods on nodes which match that node selector. Likewise, if you specify a .spec.template.spec.affinity, then the DaemonSet controller will create Pods on nodes which match that node affinity.
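Putting the pieces together, a minimal DaemonSet manifest might look like the following (the fluentd image tag and the name label are illustrative, not prescriptive). Note that the selector matches the template labels, as required from 1.8, and restartPolicy is left unspecified so it defaults to Always:

```yaml
# Illustrative minimal DaemonSet: runs one fluentd pod per node.
apiVersion: apps/v1beta2 # for versions before 1.8.0 use extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd   # must match the template labels below
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:v1.0   # hypothetical tag, pick a real one
```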
- The unschedulable field of a node is not respected by the DaemonSet controller.
- The DaemonSet controller can create Pods even when the scheduler has not been started, which can help with cluster bootstrapping.
Daemon Pods do respect taints and tolerations, but they are created with NoExecute tolerations for the following taints, with no tolerationSeconds:
node.kubernetes.io/not-ready
node.alpha.kubernetes.io/unreachable
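For illustration, those automatically added tolerations are equivalent to entries like these in the pod spec (the controller adds them for you; you do not write them yourself):

```yaml
# What the DaemonSet controller effectively adds to each daemon pod:
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
- key: node.alpha.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
```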
If node labels are changed, the DaemonSet will promptly add Pods to newly matching nodes and delete Pods from newly not-matching nodes.
In Kubernetes version 1.6 and later, you can perform a rolling update on a DaemonSet.
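To use it, set the DaemonSet’s updateStrategy; a sketch (OnDelete is the other legal strategy type):

```yaml
# Sketch: opts a DaemonSet into rolling updates, replacing at most
# one daemon pod at a time.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
```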