- Node Problem Detector runs as a DaemonSet to monitor node health
- It collects node problems from other daemons and reports them to the API server as NodeConditions (e.g. OutOfDisk, Ready, MemoryPressure, DiskPressure, NetworkUnavailable) and Events
- Kubernetes does not automatically take any action on the data sent to the API server
- It is used for kernel log monitoring only
- Steps to implement Node Problem Detector:
- Create a YAML configuration file for the DaemonSet
- The image for the container in the DaemonSet is k8s.gcr.io/node-problem-detector:v0.1
- The securityContext for the container sets privileged: true
- Apply the configuration using the kubectl apply -f command
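
The steps above can be sketched as a small Python script that builds the DaemonSet manifest and prints it as JSON, which kubectl apply -f accepts alongside YAML. Only the image and the privileged securityContext come from the notes; the object name, the kube-system namespace, the label, and the /var/log hostPath mount are assumptions for illustration and should be adjusted to the cluster.

```python
import json


def node_problem_detector_daemonset(image="k8s.gcr.io/node-problem-detector:v0.1"):
    """Return a DaemonSet manifest for Node Problem Detector as a plain dict."""
    # Assumption: this label is illustrative; selector and pod template must match.
    labels = {"app": "node-problem-detector"}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {
            "name": "node-problem-detector",  # assumption: hypothetical name
            "namespace": "kube-system",       # assumption: hypothetical namespace
        },
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "node-problem-detector",
                            "image": image,
                            # From the notes: the container runs privileged.
                            "securityContext": {"privileged": True},
                            # Assumption: host logs mounted read-only so the
                            # kernel log is visible inside the container.
                            "volumeMounts": [
                                {"name": "log", "mountPath": "/log", "readOnly": True}
                            ],
                        }
                    ],
                    "volumes": [{"name": "log", "hostPath": {"path": "/var/log/"}}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Write the output to a file and pass that file to kubectl apply -f.
    print(json.dumps(node_problem_detector_daemonset(), indent=2))
```

Generating the manifest from code is only one option; the same fields can be written by hand in a YAML file.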
- The resource metrics pipeline provides a limited set of metrics, held by the short-term, in-memory metrics-server and exposed to the user via the metrics.k8s.io API
- Metrics server discovers all nodes in the cluster and then queries each node's kubelet for CPU and memory usage. The kubelet then fetches the individual container usage statistics from the container runtime through the container runtime interface. It then exposes the aggregated pod resource usage statistics through the metrics-server Resource Metrics API.
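
To make the aggregation step concrete, here is a minimal sketch that sums per-container usage into a pod total. It assumes an input shaped like a PodMetrics item from metrics.k8s.io (a containers list whose entries carry usage.cpu and usage.memory strings). The helper names and the sample values are hypothetical, and the quantity parser handles only a subset of Kubernetes quantity suffixes.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes quantity suffixes (simplified).
_SUFFIXES = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6, "G": Decimal(10) ** 9,
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
}


def parse_quantity(text):
    """Convert a quantity string such as '250m' or '64Mi' to a Decimal."""
    # Check longer suffixes first so binary suffixes like 'Mi' win.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)  # no suffix: plain cores or bytes


def pod_usage(pod_metrics):
    """Sum container CPU (cores) and memory (bytes) for one pod."""
    containers = pod_metrics["containers"]
    cpu = sum((parse_quantity(c["usage"]["cpu"]) for c in containers), Decimal(0))
    memory = sum((parse_quantity(c["usage"]["memory"]) for c in containers), Decimal(0))
    return {"cpu_cores": cpu, "memory_bytes": memory}


# Hypothetical sample data, not output from a real cluster.
sample = {
    "metadata": {"name": "web-0", "namespace": "default"},
    "containers": [
        {"name": "app", "usage": {"cpu": "250m", "memory": "64Mi"}},
        {"name": "sidecar", "usage": {"cpu": "50000000n", "memory": "16Mi"}},
    ],
}

if __name__ == "__main__":
    print(pod_usage(sample))
```

Decimal is used instead of float so that nanocore CPU values add up without rounding drift.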
- Prometheus is a separate open-source tool that can be used to monitor Kubernetes, nodes, and Prometheus itself