Kubernetes Scheduling - DaemonSet

Kubernetes Scheduling - DaemonSet

Rakesh Jain's photo
Rakesh Jain

Published on Nov 30, 2021

7 min read

Subscribe to our newsletter and never miss any upcoming articles

Listen to this article

Table of contents

What are DaemonSets?

DeamonSets are like Replica Sets, as they help you deploy multiple instances of pod, but it runs one copy of your pod on each node in your Kubernetes cluster.

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.

Use cases of DaemonSets

Following are some typical use cases of DaemonSets:

  • When you would like to deploy a monitoring agent on each of your nodes in the cluster so you can monitor your cluster better. Then you don't have to worry about adding or removing monitoring agents from these nodes when there are changes in your cluster as the DaemonSet will take care of that for you. Such as Prometheus Node Exporter, Datadog agent etc.

  • A log collector running on each of your nodes in the cluster. For example logstash, fluentd.

  • A cluster storage daemon running on every nodes in the cluster. For example glusterd - ceph.

  • For networking solutions for example Weavenet would requires an Weavenet agent running on every node in the cluster.

  • kube-proxy is another good use case of DaemonSets. The reason behind that is that kube-proxy is needed on every node in the cluster to run IP tables, so that every node can access every pod no matter on which node it resides. Hence, when we make kube-proxy a daemonset and another node is added to the cluster at a later time, kube-proxy is automatically spawned on that node.

In a simple case, one DaemonSet, covering all nodes, would be used for each type of daemon. We can also configure multiple DaemonSets for a single type of daemon with different flags, memory, CPU, etc.

Kube-dns responsibility is to discover a service IP using its name and only one replica of kube-dns is enough to resolve the service name to its IP. Hence we make kube-dns a deployment, because we don't need kube-dns on every node.

How to schedule DaemonSet?

A Daemonset controller manages pods like Deployments, ReplicaSets, and StatefulSets.

Create a DaemonSet

Kubernetes_daemonset.png

In this example, we’ll deploy prometheus node-exporter agent pod on each node of the cluster.

The following is an example of a DaemonSet manifest file:

root@kube-master:~/DaemonSet# cat daemonset_demo.yml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-demo-prometheus
spec:
  selector:
    matchLabels:
      tier: monitoring
      name: prometheus-exporter  # Label selector that determines which Pods belong to the DaemonSet
  template:
    metadata:
      labels:
        tier: monitoring # Pod template's label selector
        name: prometheus-exporter # Pod template's label selector
    spec:
      containers:
      - name: prometheus
        image: prom/node-exporter
        ports:
        - containerPort: 80

Let us understand what the above manifest file is all about.

  • A DaemonSet named daemonset-demo-prometheus is created, indicated by the metadata: name field.
  • DaemonSet's Pod is labelled prometheus-exporter and tier as monitoring.
  • The Pod's container pulls the latest node-exporter image. The container image is hosted by Container Registry.

Create DaemonSet by applying the manifest file using kubectl command.

root@kube-master:~/DaemonSet# kubectl apply -f daemonset_demo.yml --record
daemonset.apps/daemonset-demo-prometheus created

The --record flag will track changes made through each revision.

Verify the DaemonSet created :

root@kube-master:~/DaemonSet# kubectl get daemonsets daemonset-demo-prometheus -o wide
NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE     CONTAINERS   IMAGES               SELECTOR
daemonset-demo-prometheus   3         3         3       3            3           <none>          2m13s   prometheus   prom/node-exporter   name=prometheus-exporter,tier=monitoring

Get more details about the DaemonSet created :

root@kube-master:~/DaemonSet# kubectl describe daemonsets daemonset-demo-prometheus
Name:           daemonset-demo-prometheus
Selector:       name=prometheus-exporter,tier=monitoring
Node-Selector:  <none>
Labels:         <none>
Annotations:    deprecated.daemonset.template.generation: 1
                kubernetes.io/change-cause: kubectl apply --filename=daemonset_demo.yml --record=true
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  name=prometheus-exporter
           tier=monitoring
  Containers:
   prometheus:
    Image:        prom/node-exporter
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age    From                  Message
  ----    ------            ----   ----                  -------
  Normal  SuccessfulCreate  2m52s  daemonset-controller  Created pod: daemonset-demo-prometheus-2l57n
  Normal  SuccessfulCreate  2m52s  daemonset-controller  Created pod: daemonset-demo-prometheus-8sgfk
  Normal  SuccessfulCreate  2m52s  daemonset-controller  Created pod: daemonset-demo-prometheus-ffbp6

Getting details pods in DaemonSet:

root@kube-master:~/DaemonSet# kubectl get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE    IP                NODE           NOMINATED NODE   READINESS GATES
daemonset-demo-prometheus-2l57n   1/1     Running   0          5m1s   192.168.221.241   kube-master    <none>           <none>
daemonset-demo-prometheus-8sgfk   1/1     Running   0          5m1s   192.168.194.4     kube-worker1   <none>           <none>
daemonset-demo-prometheus-ffbp6   1/1     Running   0          5m1s   192.168.161.240   kube-worker2   <none>           <none>

You can see from the output above, there are 3 pods deployed. One on each the master and the worker nodes. Here the DaemonSet made sure we had one pod per node without using any affinity rules.

Delete a DaemonSet

Like all other Kubernetes object DaemonSet also can be deleted either by using the same manifest file it was created with or by using the kubectl command.

root@kube-master:~/DaemonSet# kubectl delete -f daemonset_demo.yml
daemonset.apps "daemonset-demo-prometheus" deleted
root@kube-master:~/DaemonSet# kubectl delete daemonsets daemonset-demo-prometheus
daemonset.apps "daemonset-demo-prometheus" deleted

How to update DaemonSets?

DaemonSet can be updated by changing following specifications:

  • Pod specification
  • resource requests
  • limits
  • labels
  • Annotations

We can use either of two update strategies to update a Daemonset:

RollingUpdate - This is the default strategy and will be applied upon configuration change. It automatically deletes and recreates the DaemonSet pods.

OnDelete - When using this strategy Kubernetes does not automatically delete the DaemonSet pods and recreate them in case of configuration changes. You have to delete pods manually, and then controller will create a new pods with new changes.

How Daemon Pods are scheduled

A DaemonSet ensures that all eligible nodes run a copy of a Pod. Normally Kubernetes scheduler decides the node that a Pod runs on. However, DaemonSet pods are created and scheduled by the DaemonSet controller instead.

But that causes following issues:

  • Inconsistent Pod behavior: Normal Pods waiting to be scheduled are created and remain in Pending state, but DaemonSet pods are not created in Pending state. This is confusing to the user.

  • Pod preemption is handled by default scheduler. When preemption is enabled, the DaemonSet controller will make scheduling decisions without considering pod priority and preemption.

What is the Solution?

ScheduleDaemonSetPods allows you to schedule DaemonSets using the default scheduler instead of the DaemonSet controller, by adding the NodeAffinity term to the DaemonSet pods, instead of the .spec.nodeName term. The default scheduler is then used to bind the pod to the target host.

Following manifest file creates a pod using node affinity rules.

apiVersion: v1
kind: Pod
metadata:
  name: web-servers
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: workload
            operator: In
            values:
            - staging
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent

If you observe above manifest file we have a Pod scheduled with requiredDuringSchedulingIgnoredDuringExecution type Node Affinity and the expression key-value as workload: staging. This means that the pod will only be scheduled on a node that has a workload=staging label.

Taints and Tolerations

Daemon Pods respect taints and tolerations. The node.kubernetes.io/unschedulable:NoSchedule toleration is added automatically to DaemonSet Pods. The default scheduler ignores unschedulable Nodes when scheduling DaemonSet Pods.

Communicating with Daemon Pods

There are different ways we can communicate with pods created by DaemonSets.

  • Push : Using this way the pods can be configured to send information such as stats database or monotoring service to other services. They do not have clients.

  • NodeIP & Known Port : Pods can be reached via their respective node IPs using hostPort. Users can use the known ports and the node IP to communicate with the pods.

  • DNS : We can create a headless service with the same pod selector, and then discover DaemonSets using the endpoints resource or retrieve multiple A records from DNS.

  • Service : We can create a service with the same Pod selector, and use the service to reach a daemon on a random node. (No way to reach specific node.)

Summary

DaemonSets are very useful when you want to deploy ongoing background tasks that you need to run on all or certain nodes without any user intervention. Example of such tasks are log collection daemons like logstash, storage daemons like ceph, and node monitoring daemons like prometheus node-exporter etc.

This is all about Kubernetes DaemonSet.

Hope you like the tutorial. Stay tuned and don't forget to provide your feedback in the response section.

Happy Learning!

 
Share this