When using Kubernetes, collecting performance metrics is mandatory: you need them to plot meaningful graphs (a must-have during troubleshooting), visualize alerts in dashboards and get email notifications of critical events. The best practice is to deploy Prometheus, Grafana and Alertmanager.
In the "Kubernetes Prometheus And Grafana With kube-prometheus-stack" post we see how to easily deploy and setup these three amazing components to setup a full featured alerting and performance monitoring solution.
The post also shows how to deploy additional Grafana Dashboards and how to deploy custom Prometheus rules to trigger application specific alerts.
What Is kube-prometheus-stack
The kube-prometheus-stack is not only an easy way to deploy Prometheus, Prometheus' Alertmanager and Grafana on Kubernetes: it also provides a collection of Kubernetes manifests, Grafana dashboards and Prometheus rules that make it easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.
Luckily, they also provide a Helm chart to easily deploy it - please mind it has a dependency on the Grafana Helm chart.
Configure Helm
In this post I'm showing how to deal with this using the "helm" command line utility, but the same procedure also applies to tools (such as Rancher) that provide Helm's features through a UI.
In a clean and tidy setup, it's best to have a directory dedicated to the Helm values files of each distinct deployment - just create the directory as follows:
mkdir ~/deployments
The purpose is to have a common place to store the values file of each single instance we deploy, so that we can easily deploy it again if necessary later on, for example to upgrade the containers, to apply the enhancements of an upgraded Helm chart or even just to change some settings.
Although this is not necessary for running Helm, it is wise to create a git repository within the "~/deployments" directory, so as to be able to easily roll back configuration changes if anything goes wrong.
cd ~/deployments; git init
And of course configure it to push the commits to a remote repository. For more information on how to work professionally with Git, you may want to read my post "GIT Tutorial - A thorough Version Control with Git Howto".
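For example, assuming an empty remote repository has already been created (the URL below is just a placeholder - replace it with your own, and adjust the branch name if your default is not "master"):
git remote add origin git@git.example.org:infrastructure/deployments.git
git push -u origin master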
Add The Grafana Helm Repository
First, add the Grafana Helm chart's repository "https://grafana.github.io/helm-charts" as follows:
helm repo add grafana \
https://grafana.github.io/helm-charts
Add The Prometheus Helm Repository
Once done, we must also add the official Prometheus Community Helm chart's repository "https://prometheus-community.github.io/helm-charts", which hosts the "kube-prometheus-stack" chart itself (the Grafana repository we just added is needed because the chart depends on the official Grafana project's Helm chart):
helm repo add prometheus \
https://prometheus-community.github.io/helm-charts
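Once both repositories have been added, refresh the local cache of available charts:
helm repo update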
Setup kube-prometheus-stack
Now that Helm has been properly configured for fetching the necessary Helm charts, it is time to create the "kube-prometheus-stack" Helm values file.
Grab A Copy Of The kube-prometheus-stack Values File
The safest option, so as to have it perfectly matching the Helm chart version we are installing, is to download the Helm chart itself to a local directory as follows:
helm pull prometheus/kube-prometheus-stack \
--untar
and then move the values.yaml file to the "~/deployments" directory previously created - please note how we are also renaming it meaningfully:
mv kube-prometheus-stack/values.yaml \
~/deployments/kube-prometheus-stack.yaml
Unless we want to study the chart's internals, the contents of the whole "kube-prometheus-stack" chart are not necessary. For this reason, we can remove the directory as follows:
rm -rf kube-prometheus-stack
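Note that "helm pull" fetches the latest available chart version by default; if you need to reproduce a specific release (for example the 65.8.1 version this post was written against), you can pin it with the "--version" flag:
helm pull prometheus/kube-prometheus-stack \
--version 65.8.1 --untar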
Customize The kube-prometheus-stack Values File
At this point we finally have a copy of the value file we can customize to fit our needs.
Customize Prometheus
The default Prometheus configuration suits almost every use case - the only missing bit is having the ingress rule automatically created during the deployment. We can achieve this by adding the missing bits as shown in the below snippet:
prometheus:
  ingress:
    annotations: {}
    enabled: true
    hosts:
      - kube-p1-0.p1.carcano.corp
    labels: {}
    paths:
      - /
      - /alerts
      - /graph
      - /status
      - /tsdb-status
      - /flags
      - /config
      - /rules
      - /targets
      - /service-discovery
In this example the Prometheus FQDN is "kube-p1-0.p1.carcano.corp", so - if you configured the load balancer and the Kubernetes Ingress controller as I described in the "RKE2 Tutorial - RKE2 Howto On Oracle Linux 9" post - Prometheus will be available at "https://kube-p1-0.p1.carcano.corp".
Customize Alertmanager
The next component to customize is Alertmanager - also in this case the default configuration suits almost every use case, but here too we do a little bit of tuning to have the ingress rule automatically created during the deployment: just add the missing bits as shown in the below snippet:
alertmanager:
  alertmanagerSpec:
    ...
    routePrefix: /alertmanager
    ...
  config:
    ...
    receivers:
    - name: 'null'
    - email_configs:
      - from: kube-p1-0@carcano.corp
        require_tls: false
        send_resolved: true
        smarthost: mail-p1-0.p1.carcano.corp:587
        to: devops@carcano.corp
      name: global-email-receiver
    ...
    routes:
    - matchers:
      - alertname = "Watchdog"
      receiver: 'null'
    - matchers:
      - namespace != "preproduction"
      receiver: global-email-receiver
    ...
  ingress:
    annotations: {}
    enabled: true
    hosts:
      - kube-p1-0.p1.carcano.corp
    labels: {}
    paths:
      - /alertmanager
The Alertmanager tweaking is a little bit more complex:
- we are configuring the ingress rule to publish it at "https://kube-p1-0.p1.carcano.corp/alertmanager" - the FQDN is set at line 30, whereas the "/alertmanager" path is configured at line 33. In addition to that, Alertmanager itself is configured to use "/alertmanager" as its base path (line 4) - mind that this works only if you configured the load balancer and the Kubernetes Ingress controller as I described in the "RKE2 Tutorial - RKE2 Howto On Oracle Linux 9" post
- we add the "global-email-receiver" alert receiver capable to send email notifications to "devops@carcano.corp" using "kube-p1-0@carcano.corp" as sender email address relying through the "mail-p1-0.p1.carcano.corp" SMTP server on port "587" (lines 10-16)
- we configure a route that sends every alert to the "global-email-receiver" receiver, except the ones coming from the "preproduction" namespace (lines 22-24)
The above snippet provides the Alertmanager configuration hardcoded in the Helm chart's values. It is anyway possible to provide it as a standalone manifest of kind "AlertmanagerConfig": in this case, it is necessary to reference the manifest's name in "alertmanager.alertmanagerSpec.alertmanagerConfiguration", for example:
alertmanager:
  ...
  alertmanagerSpec:
    ...
    alertmanagerConfiguration:
      name: global-alertmanagerconfig
Be wary that, despite being similar, the Alertmanager configuration uses snake_case, whereas the manifest of kind AlertmanagerConfig uses camelCase.
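For reference, a minimal sketch of what the referenced "global-alertmanagerconfig" manifest could look like (note the camelCase field names - it mirrors the email receiver configured above and assumes the stack runs in the "monitoring" namespace):
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: global-alertmanagerconfig
  namespace: monitoring
spec:
  route:
    receiver: 'global-email-receiver'
  receivers:
  - name: 'global-email-receiver'
    emailConfigs:
    - to: 'devops@carcano.corp'
      from: 'kube-p1-0@carcano.corp'
      smarthost: 'mail-p1-0.p1.carcano.corp:587'
      requireTLS: false
      sendResolved: true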
Customize Grafana
The last component we have to customize is Grafana: this component requires a little bit more tweaking. For example, since it supports authentication and authorization, it is certainly worth the effort to enable LDAP authentication.
In this example we are connecting to the LDAP service of an Active Directory server.
To guarantee the confidentiality of the transmitted data, we use LDAPS: this requires first loading a copy of the certificate of the CA that issued the Active Directory service certificate into a ConfigMap:
kubectl -n monitoring create configmap \
certs-configmap --from-file=my-ca-cert.pem
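The command above assumes that the "monitoring" namespace - the namespace where we will deploy the whole stack - already exists; if it does not, just create it beforehand:
kubectl create namespace monitoring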
Once done, modify the Grafana part of the values file by adding the below snippet:
grafana:
  adminUser: admin
  adminPassword: prom-operator
  ingress:
    enabled: true
    hosts:
      - kube-p1-0.p1.carcano.corp
    path: /grafana/
    pathType: Prefix
  extraConfigmapMounts:
    - name: certs-configmap
      mountPath: /etc/grafana/ssl/
      configMap: certs-configmap
      readOnly: true
  grafana.ini:
    auth.ldap:
      allow_sign_up: true
      config_file: /etc/grafana/ldap.toml
      enabled: true
    log:
      mode: console
    server:
      root_url: https://kube-p1-0.p1.carcano.corp/grafana
      serve_from_sub_path: true
  ldap:
    config: |-
      verbose_logging = true
      [[servers]]
      host = "ad-ca-up1a001.p1.carcano.corp"
      port = 636
      use_ssl = true
      root_ca_cert = "/etc/grafana/ssl/my-ca-cert.pem"
      start_tls = false
      ssl_skip_verify = false
      bind_dn = "kube-p1-0@carcano.corp"
      bind_password = 'aG0df-We6'
      search_filter = "(sAMAccountName=%s)"
      search_base_dns = ["dc=carcano,dc=corp"]
      [servers.attributes]
      name = "givenName"
      surname = "sn"
      username = "sAMAccountName"
      member_of = "memberOf"
      email = "mail"
      [[servers.group_mappings]]
      group_dn = "CN=Rancher-0-admins,OU=Groups,DC=carcano,DC=corp"
      org_role = "Admin"
      [[servers.group_mappings]]
      group_dn = "CN=Rancher-0-operators,OU=Groups,DC=carcano,DC=corp"
      org_role = "Editor"
      [[servers.group_mappings]]
      group_dn = "*"
      org_role = "Viewer"
    enabled: true
    existingSecret: ''
As you can probably guess, if you configured the load balancer and the Kubernetes Ingress controller as I described in the "RKE2 Tutorial - RKE2 Howto On Oracle Linux 9" post, Grafana will be available at "https://kube-p1-0.p1.carcano.corp/grafana".
The above sample configures Grafana to bind to the "ad-ca-up1a001.p1.carcano.corp" Active Directory as user "kube-p1-0@carcano.corp".
It also grants:
- administrative rights to the members of the "CN=Rancher-0-admins,OU=Groups,DC=carcano,DC=corp" Active Directory group
- "Editor" rights to the members of the "CN=Rancher-0-operators,OU=Groups,DC=carcano,DC=corp"
- "Viewer" rights to anyone else.
If you are dealing with an already deployed Grafana and you are looking for the administrative credentials, they are stored within the "kube-prometheus-stack-grafana" secret in the same namespace where "kube-prometheus-stack" is running. You can easily retrieve them as follows:
Username:
kubectl -n monitoring get secrets kube-prometheus-stack-grafana \
-o jsonpath="{.data.admin-user}" | base64 --decode ; echo
Password:
kubectl -n monitoring get secrets kube-prometheus-stack-grafana \
-o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Deploy kube-prometheus-stack
We are finally ready to deploy the "kube-prometheus-stack" - first, just run a dry-run to check that everything is properly configured:
helm install -f ~/deployments/kube-prometheus-stack.yaml \
--namespace monitoring --dry-run \
kube-prometheus-stack prometheus/kube-prometheus-stack
if the outcome is good (the Helm templates are properly rendered and printed to the standard output), then proceed with the actual deployment as follows:
helm install -f ~/deployments/kube-prometheus-stack.yaml \
--namespace monitoring \
kube-prometheus-stack prometheus/kube-prometheus-stack
and just wait until the deployment completes - if you are using git for versioning the values file, this is the time for committing the changes.
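You can keep an eye on the rollout by watching the pods coming up in the "monitoring" namespace:
kubectl -n monitoring get pods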
At the time of writing this post, the "kube-prometheus-stack" Helm chart (65.8.1) adds the unnecessary entry "k8s-app=kube-proxy" to the "kube-prometheus-stack-kube-proxy" service selector. Sadly this causes a mismatch resulting in no pod endpoints being bound to the service, with the consequence that Prometheus is not able to scrape them, which in turn raises an alert in Alertmanager.
This can be easily checked by running:
kubectl -n kube-system describe svc \
kube-prometheus-stack-kube-proxy | grep Selector
If the selector's output is different from:
Selector: component=kube-proxy
Then you must remove any additional entry, leaving only the "component=kube-proxy" item.
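If you prefer not to edit the service by hand, a JSON patch like the following (just a sketch - it drops the offending "k8s-app" key from the selector) does the trick:
kubectl -n kube-system patch service kube-prometheus-stack-kube-proxy \
--type=json -p='[{"op": "remove", "path": "/spec/selector/k8s-app"}]'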
Another cause of problems can be having the "kube-controller-manager", "kube-scheduler" and of course the "kube-proxy" endpoints bound to the host's loopback interface: this way they will be unreachable by Prometheus, causing alerts to be fired. The solution is starting them with the related option (such as "bind-address=0.0.0.0") to prevent that strict binding: you can find examples in the "RKE2 Tutorial - RKE2 Howto On Oracle Linux 9" post.
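As a reference, on RKE2 this boils down to adding something like the following sketch to "/etc/rancher/rke2/config.yaml" on the server nodes (mind that for kube-proxy the metrics endpoint binding is controlled by "metrics-bind-address") - see the RKE2 post mentioned above for the full picture:
kube-controller-manager-arg:
- bind-address=0.0.0.0
kube-scheduler-arg:
- bind-address=0.0.0.0
kube-proxy-arg:
- metrics-bind-address=0.0.0.0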
Set Up Additional Rules
AlertManager Rules
The default Alertmanager configuration we provided sends email alerts for everything, routing only the Watchdog alert to the "null" receiver - the Watchdog alert exists only to verify that the alerting pipeline is working, it's not a "real" alert.
If you want a more fine-grained control, you can add namespace-specific AlertmanagerConfig manifests with additional settings.
For example, the following manifest configures Alertmanager to notify "dba@carcano.corp" when alerts are raised in the "databases" namespace:
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: dba-alerts
  namespace: databases
spec:
  receivers:
  - name: 'dba-email-receiver'
    emailConfigs:
    - to: 'dba@carcano.corp'
      sendResolved: true
      smarthost: 'mail-p1-0.p1.carcano.corp:587'
      from: 'kube-p1-0@carcano.corp'
      requireTLS: no
  route:
    groupBy: ['node']
    groupWait: 30s
    groupInterval: 5m
    receiver: 'dba-email-receiver'
    repeatInterval: 10m
    matchers:
    - name: alertname
      regex: true
      value: '.*'
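Once saved to a file (for example "dba-alerts.yaml" - the name is arbitrary), apply it as usual:
kubectl apply -f dba-alerts.yaml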
Prometheus Rules
It is of course possible to define custom Prometheus rules that trigger alerts on application-specific events, computed from application-specific metrics.
Just as an example:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app: kube-prometheus-stack
    app.kubernetes.io/instance: kube-prometheus-stack
    release: kube-prometheus-stack
  name: kube-prometheus-stack.carcano.rules.redis
  namespace: cicd
spec:
  groups:
  - name: outages
    rules:
    - alert: RedisDown
      expr: redis_up == 0
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis down (instance {{ $labels.instance }})
        description: "Redis instance is down\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
    - alert: RedisRejectedConnections
      expr: increase(redis_rejected_connections_total[1m]) > 0
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis rejected connections (instance {{ $labels.instance }})
        description: "Some connections to Redis has been rejected\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
The above manifest defines the "kube-prometheus-stack.carcano.rules.redis" Prometheus rule as follows:
- it applies to the "cicd" namespace only (line 9)
- it defines a group of rules called "outages" (lines 12 - 29) with the following alerts:
- RedisDown: triggered when the "redis_up" metric is equal to 0 (lines 14 - 21)
- RedisRejectedConnections: triggered when the increase of the "redis_rejected_connections_total" metric over one minute is greater than 0 (lines 22 - 29)
You can of course use the above example as the starting point for writing your own application-specific Prometheus rules.
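After applying the manifest, you can quickly verify it has been created - Prometheus picks it up automatically shortly afterwards:
kubectl -n cicd get prometheusrules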
Monitoring A Workload
Let's see a real-world example of adding performance monitoring to an off-the-shelf application deployed with a Helm chart.
The steps for enabling the collection of performance metrics are:
- enable metrics - there is often a specific section in the Helm values file - just look for it and set enabled to true
- enable the service monitor - again, there is often a specific section in the Helm values file - just look for it and set enabled to true
Please make sure the label "release: kube-prometheus-stack" is really assigned to the service monitor - check it after the deployment of the Helm chart: if it is missing, Prometheus will ignore the service monitor. A typical layout is shown in the sketch below.
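As an example, with many charts the relevant values look more or less like the following sketch - the exact keys vary from chart to chart, so always check the chart's own values file:
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    additionalLabels:
      release: kube-prometheus-stack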
You must then add a dashboard - the best practice is to look for an already existing one (for example in the public Grafana dashboards catalog), and only if nothing suitable exists, create your own, possibly starting from an existing one that is quite similar.
As an example, if you want to add the Grafana Redis HA Dashboard, first download its JSON file and rename it to "redis-ha-11835_rev1.json".
Then, add the "grafana-redis-ha" config map in the namespace of the monitored application, assigning it the label "grafana_dashboard=1", uploading the "redis-ha-11835_rev1.json" file into it.
Just wait a few moments, and the dashboard will appear in the list of the available dashboards.
The Stack In Action
The "kube-prometheus-stack" is really an invaluable ally to monitor and detect problems: the most advanced dashboards do help you to make sense even of performance metrics tightly bound to the specific application business logic.
In our example setup you can access Grafana by connecting to the URL "https://kube-p1-0.p1.carcano.corp/grafana", logging in with your Active Directory credentials.
The below screenshot for example is of a KeyDB instance:
As you see, it renders in an easy to understand way business-related metrics such as the number of keys of each DB, the number of expiring and non-expiring keys and so on - having such a detailed view can really save your life when things start behaving unexpectedly, since it greatly helps you make sense of what's happening inside the application.
A very useful dashboard to see how long problems last is the "Alertmanager / Overview" one - you can see an example in the picture below:
As you can see, it's showing a timeline with the number of alerts as well as the number of email notifications sent and their duration - these graphs are really useful when drawing up incident reports.
In the above example, you can immediately guess that something serious is happening, since the alerts suddenly rose in number and then stabilized.
Besides checking the notification emails, you can check the Alertmanager dashboard - in our setup you can access it by connecting to the "https://kube-p1-0.p1.carcano.corp/alertmanager" URL.
The below picture is related to the situation depicted by the previous one:
As you see there was really a huge accident: we lost a Kubernetes master node! You can guess that from the etcd, KubeNode and KubeDaemonSetRollout alarms.
Footnotes
We have come to the end of this post. As you saw, deploying Prometheus and Grafana using the "kube-prometheus-stack" is really quite an easy walk: installing this amazing suite enables you to have an inexpensive yet very effective performance monitoring solution.
If you liked this post, … please give your tip in the small cup below: