
      How to Deploy Prometheus Operator and Grafana on Linode Kubernetes Engine


      Updated by Linode. Written by Ben Bigger.

      In this guide, you will deploy the Prometheus Operator to your Linode Kubernetes Engine (LKE) cluster using Helm, either as:

      • A minimal deployment, with local access to your monitoring interfaces through kubectl port-forward, or
      • A deployment with public access to your monitoring interfaces over HTTPS with basic auth.

      The Prometheus Operator Monitoring Stack

      When administrating any system, effective monitoring tools can empower users to perform quick and effective issue diagnosis and resolution. This need for monitoring solutions has led to the development of several prominent open source tools designed to solve problems associated with monitoring diverse systems.

      Since its release in 2016, Prometheus has become a leading monitoring tool for containerized environments including Kubernetes. Alertmanager is often used with Prometheus to send and manage alerts with tools such as Slack. Grafana, an open source visualization tool with a robust web interface, is commonly deployed along with Prometheus to provide centralized visualization of system metrics.

      The community-supported Prometheus Operator Helm Chart provides a complete monitoring stack including each of these tools along with Node Exporter and kube-state-metrics, and is designed to provide robust Kubernetes monitoring in its default configuration.

      While there are several options for deploying the Prometheus Operator, using Helm, a Kubernetes “package manager,” to deploy the community-supported Prometheus Operator chart enables you to:

      • Control the components of your monitoring stack with a single configuration file.
      • Easily manage and upgrade your deployments.
      • Utilize out-of-the-box Grafana interfaces built for Kubernetes monitoring.

      Before You Begin

      1. Deploy an LKE Cluster. This guide was written using an example node pool with three 2 GB Linodes. Depending on the workloads you will be deploying on your cluster, you may consider using Linodes with more available resources.

      2. Install Helm 3 to your local environment.

      3. Install kubectl to your local environment and connect to your cluster.

      4. Create the monitoring namespace on your LKE cluster:

        kubectl create namespace monitoring
        
      5. Create a directory named lke-monitor to store all of your Helm values and Kubernetes manifest files and move into the new directory:

        mkdir ~/lke-monitor && cd ~/lke-monitor
        
      6. Add the Google stable Helm charts repository to your Helm repos:

        helm repo add stable https://kubernetes-charts.storage.googleapis.com/
        
      7. Update your Helm repositories:

        helm repo update
        
      8. (Optional) To configure public access to your monitoring tools’ web interfaces with HTTPS and basic auth:

        • Purchase a domain name from a reliable domain registrar and configure your registrar to use Linode’s nameservers with your domain. Then, using Linode’s DNS Manager, create a new Domain for the domain you have purchased.

        • Ensure that htpasswd is installed in your local environment. On many systems this tool is already installed. Debian and Ubuntu users will need to install the apache2-utils package with the following command:

          sudo apt install apache2-utils
          

      Prometheus Operator Minimal Deployment

      In this section, you will complete a minimal deployment of the Prometheus Operator for individual/local access with kubectl Port-Forward. If you require your monitoring interfaces to be publicly accessible over the internet, you can skip to the following section on completing a Prometheus Operator Deployment with HTTPS and Basic Auth.

      Deploy Prometheus Operator

      In this section, you will create a Helm chart values file and use it to deploy Prometheus Operator to your LKE cluster.

      1. Using the text editor of your choice, create a file named values.yaml in the ~/lke-monitor directory and save it with the configurations below. Since the control plane is Linode-managed, as part of this step we will also disable metrics collection for the control plane components:

        Caution

        The below configuration will establish persistent data storage with three separate 10GiB Block Storage Volumes for Prometheus, Alertmanager, and Grafana. Because the Prometheus Operator deploys as StatefulSets, these Volumes and their associated Persistent Volume resources must be deleted manually if you later decide to tear down this Helm release.
        ~/lke-monitor/values.yaml
        
        # Prometheus Operator Helm Chart values for Linode Kubernetes Engine minimal deployment
        prometheus:
          prometheusSpec:
            storageSpec:
              volumeClaimTemplate:
                spec:
                  storageClassName: linode-block-storage-retain
                  resources:
                    requests:
                      storage: 10Gi
        
        alertmanager:
          alertmanagerSpec:
            storage:
              volumeClaimTemplate:
                spec:
                  storageClassName: linode-block-storage-retain
                  resources:
                    requests:
                      storage: 10Gi
        
        grafana:
          persistence:
            enabled: true
            storageClassName: linode-block-storage-retain
            size: 10Gi
        
        # Disable metrics for Linode-managed Kubernetes control plane elements
        kubeEtcd:
          enabled: false
        
        kubeControllerManager:
          enabled: false
        
        kubeScheduler:
          enabled: false
            
      2. Export an environment variable to store your Grafana admin password:

        Note

        Replace prom-operator in the below command with a secure password and save the password for later reference.

        export GRAFANA_ADMINPASSWORD="prom-operator"
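If you prefer a randomly generated value, one option (a sketch assuming openssl is available on your local machine) is to generate the password with openssl before exporting it:

```shell
# Generate a random 24-character password and export it for the Helm commands below.
GRAFANA_ADMINPASSWORD="$(openssl rand -base64 18)"
export GRAFANA_ADMINPASSWORD
# Print it once so you can store it securely for later Grafana logins.
echo "$GRAFANA_ADMINPASSWORD"
```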
        
      3. Using Helm, deploy a Prometheus Operator release labeled lke-monitor in the monitoring namespace on your LKE cluster with the settings established in your values.yaml file:

        helm install \
        lke-monitor stable/prometheus-operator \
        -f ~/lke-monitor/values.yaml \
        --namespace monitoring \
        --set grafana.adminPassword=$GRAFANA_ADMINPASSWORD
        

        Note

        You can safely ignore messages similar to manifest_sorter.go:192: info: skipping unknown hook: "crd-install" as discussed in this GitHub issues thread.

        Alternatively, you can add --set prometheusOperator.createCustomResource=false to the above command to prevent the message from appearing.

      4. Verify that the Prometheus Operator has been deployed to your LKE cluster and its components are running and ready by checking the pods in the monitoring namespace:

        kubectl -n monitoring get pods
        

        You should see a similar output to the following:

          
        NAME                                                     READY   STATUS    RESTARTS   AGE
        alertmanager-lke-monitor-prometheus-ope-alertmanager-0   2/2     Running   0          45s
        lke-monitor-grafana-84cbb54f98-7gqtk                     2/2     Running   0          54s
        lke-monitor-kube-state-metrics-68c56d976f-n587d          1/1     Running   0          54s
        lke-monitor-prometheus-node-exporter-6xt8m               1/1     Running   0          53s
        lke-monitor-prometheus-node-exporter-dkc27               1/1     Running   0          53s
        lke-monitor-prometheus-node-exporter-pkc65               1/1     Running   0          53s
        lke-monitor-prometheus-ope-operator-f87bc9f7c-w56sw      2/2     Running   0          54s
        prometheus-lke-monitor-prometheus-ope-prometheus-0       3/3     Running   1          35s
            
        

      Access Monitoring Interfaces with Port-Forward

      1. List the services running in the monitoring namespace and review their respective ports:

        kubectl -n monitoring get svc
        

        You should see an output similar to the following:

          
        NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                     AGE
        alertmanager-operated                     ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP  115s
        lke-monitor-grafana                       ClusterIP   10.128.140.155  <none>        80/TCP                      2m3s
        lke-monitor-kube-state-metrics            ClusterIP   10.128.165.34   <none>        8080/TCP                    2m3s
        lke-monitor-prometheus-node-exporter      ClusterIP   10.128.192.213  <none>        9100/TCP                    2m3s
        lke-monitor-prometheus-ope-alertmanager   ClusterIP   10.128.153.6    <none>        9093/TCP                    2m3s
        lke-monitor-prometheus-ope-operator       ClusterIP   10.128.198.160  <none>        8080/TCP,443/TCP            2m3s
        lke-monitor-prometheus-ope-prometheus     ClusterIP   10.128.121.47   <none>        9090/TCP                    2m3s
        prometheus-operated                       ClusterIP   None            <none>        9090/TCP                    105s
            
        

        From the above output, the services you will access and their corresponding ports are:

        Resource       Service Name                               Port
        Prometheus     lke-monitor-prometheus-ope-prometheus      9090
        Alertmanager   lke-monitor-prometheus-ope-alertmanager    9093
        Grafana        lke-monitor-grafana                        80
      2. Use kubectl port-forward to open a connection to a service, then access the service’s interface by entering the corresponding address in your web browser:

        Note

        Press Ctrl+C on your keyboard to terminate a port-forward process when you are finished with the corresponding interface.

        • To provide access to the Prometheus interface at the address 127.0.0.1:9090 in your web browser, enter:

          kubectl -n monitoring \
          port-forward \
          svc/lke-monitor-prometheus-ope-prometheus \
          9090
          
        • To provide access to the Alertmanager interface at the address 127.0.0.1:9093 in your web browser, enter:

          kubectl -n monitoring \
          port-forward \
          svc/lke-monitor-prometheus-ope-alertmanager \
          9093
          
        • To provide access to the Grafana interface at the address 127.0.0.1:8081 in your web browser, enter:

          kubectl -n monitoring \
          port-forward \
          svc/lke-monitor-grafana \
          8081:80
          

          Log in with the username admin and the password you exported as $GRAFANA_ADMINPASSWORD. The Grafana dashboards are accessible at Dashboards > Manage from the left navigation bar.

      Prometheus Operator Deployment with HTTPS and Basic Auth

      Note

      Before you start on this section, ensure that you have completed all of the steps in Before You Begin.

      This section will show you how to install and configure the necessary components for secure, path-based, public access to the Prometheus, Alertmanager, and Grafana interfaces using the domain you have configured for use with Linode.

      An Ingress is used to provide external routes, via HTTP or HTTPS, to your cluster’s services. An Ingress Controller, like the NGINX Ingress Controller, fulfills the requirements presented by the Ingress using a load balancer.
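For illustration only (the Helm chart used in this guide generates the actual Ingress resources for you), a minimal Ingress that routes a path on your domain to one of the monitoring services might look like the following sketch, written against the networking.k8s.io/v1beta1 API current for the Kubernetes versions this guide targets:

```yaml
# Illustrative Ingress: routes http(s)://example.com/prometheus to the
# Prometheus service on port 9090 via the NGINX Ingress Controller.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /prometheus
        backend:
          serviceName: lke-monitor-prometheus-ope-prometheus
          servicePort: 9090
```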

      To enable HTTPS on your monitoring interfaces, you will create a Transport Layer Security (TLS) certificate from the Let’s Encrypt certificate authority (CA) using the ACME protocol. This will be facilitated by cert-manager, the native Kubernetes certificate management controller.

      While the Grafana interface is natively password-protected, the Prometheus and Alertmanager interfaces must be secured by other means. This guide covers basic authentication configurations to secure the Prometheus and Alertmanager interfaces.

      If you are completing this section of the guide after completing a Prometheus Operator Minimal Deployment, you can use Helm to upgrade your release and maintain the persistent data storage for your monitoring stack.

      Install the NGINX Ingress Controller

      In this section, you will install the NGINX Ingress Controller using Helm, which will create a NodeBalancer to handle your cluster’s traffic.

      1. Install the Google stable NGINX Ingress Controller Helm chart:

        helm install nginx-ingress stable/nginx-ingress
        
      2. Access your NodeBalancer’s assigned external IP address.

        kubectl -n default get svc -o wide nginx-ingress-controller
        

        The command will return a similar output to the following:

          
        NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE   SELECTOR
        nginx-ingress-controller   LoadBalancer   10.128.41.200   192.0.2.0      80:30889/TCP,443:32300/TCP   59s   app.kubernetes.io/component=controller,app=nginx-ingress,release=nginx-ingress
            
        
      3. Copy the IP address from the EXTERNAL-IP field, then navigate to Linode’s DNS Manager and create an A record using this external IP address and a hostname corresponding to the subdomain you plan to use with your domain.

      Now that your NGINX Ingress Controller has been deployed and your domain’s A record has been updated, you are ready to enable HTTPS on your monitoring interfaces.

      Install cert-manager

      Note

      Before performing the commands in this section, ensure that your DNS changes have had time to propagate across the internet. You can query the status of your DNS by using the following command, substituting your domain (including a subdomain if you have configured one) for example.com.

      dig +short example.com
      

      If successful, the output should return the IP address of your NodeBalancer.

      1. Install cert-manager’s CRDs.

        kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.2/cert-manager.crds.yaml
        
      2. Add the Helm repository which contains the cert-manager Helm chart.

        helm repo add jetstack https://charts.jetstack.io
        
      3. Update your Helm repositories.

        helm repo update
        
      4. Install the cert-manager Helm chart. These basic configurations should be sufficient for many use cases; however, additional cert-manager configurable parameters can be found in cert-manager’s official documentation. If the cert-manager namespace does not already exist on your cluster, create it first with kubectl create namespace cert-manager.

        helm install \
        cert-manager jetstack/cert-manager \
        --namespace cert-manager \
        --version v0.15.2
        
      5. Verify that the corresponding cert-manager pods are running and ready.

        kubectl -n cert-manager get pods
        

        You should see a similar output:

          
        NAME                                       READY   STATUS    RESTARTS   AGE
        cert-manager-749df5b4f8-mc9nj              1/1     Running   0          19s
        cert-manager-cainjector-67b7c65dff-4fkrw   1/1     Running   0          19s
        cert-manager-webhook-7d5d8f856b-4nw9z      1/1     Running   0          19s
            
        

      Create a ClusterIssuer Resource

      Now that cert-manager is installed and running on your cluster, you will need to create a ClusterIssuer resource which defines which CA can create signed certificates when a certificate request is received. A ClusterIssuer is not a namespaced resource, so it can be used by more than one namespace.

      1. Using the text editor of your choice, create a file named acme-issuer-prod.yaml with the example configurations, replacing the value of email with your own email address for the ACME challenge:

        ~/lke-monitor/acme-issuer-prod.yaml
        
        apiVersion: cert-manager.io/v1alpha2
        kind: ClusterIssuer
        metadata:
          name: letsencrypt-prod
        spec:
          acme:
            email: [email protected]
            server: https://acme-v02.api.letsencrypt.org/directory
            privateKeySecretRef:
              name: letsencrypt-secret-prod
            solvers:
            - http01:
                ingress:
                  class: nginx
            
        • This manifest file creates a ClusterIssuer resource that will register an account on an ACME server. The value of spec.acme.server designates Let’s Encrypt’s production ACME server, which should be trusted by most browsers.

          Note

          Let’s Encrypt provides a staging ACME server that can be used to test issuing trusted certificates, while not worrying about hitting Let’s Encrypt’s production rate limits. The staging URL is https://acme-staging-v02.api.letsencrypt.org/directory.
        • The value of privateKeySecretRef.name provides the name of a secret containing the private key for this user’s ACME server account (this is tied to the email address you provide in the manifest file). The ACME server will use this key to identify you.

        • To ensure that you own the domain for which you will create a certificate, the ACME server issues a challenge to a client. cert-manager provides two options for solving challenges, HTTP01 and DNS01. In this example, the HTTP01 challenge solver is used, and it is configured in the solvers array. cert-manager will spin up challenge solver Pods to solve the issued challenges and use Ingress resources to route each challenge to the appropriate Pod.
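Building on the staging note above, you can optionally create a staging variant of the same ClusterIssuer to test issuance without consuming Let’s Encrypt’s production rate limits. This sketch uses hypothetical names (letsencrypt-staging, letsencrypt-secret-staging) that only need to stay consistent wherever you reference them:

```yaml
# Staging ClusterIssuer: certificates it issues are NOT browser-trusted,
# but issuance exercises the same ACME flow as production.
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: [email protected]
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-secret-staging
    solvers:
    - http01:
        ingress:
          class: nginx
```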

      2. Create the ClusterIssuer resource:

        kubectl apply -f ~/lke-monitor/acme-issuer-prod.yaml
        

      Create a Certificate Resource

      After you have a ClusterIssuer resource, you can create a Certificate resource. This will describe your x509 public key certificate and will be used to automatically generate a CertificateRequest which will be sent to your ClusterIssuer.

      1. Using the text editor of your choice, create a file named certificate-prod.yaml with the example configurations:

        Note

        Replace the value of spec.dnsNames with the domain, including subdomains, that you will use to host your monitoring interfaces.

        ~/lke-monitor/certificate-prod.yaml
        
        apiVersion: cert-manager.io/v1alpha2
        kind: Certificate
        metadata:
          name: prometheus-operator-prod
          namespace: monitoring
        spec:
          secretName: letsencrypt-secret-prod
          duration: 2160h # 90d
          renewBefore: 360h # 15d
          issuerRef:
            name: letsencrypt-prod
            kind: ClusterIssuer
          dnsNames:
          - example.com
            

        Note

        The configurations in this example create a Certificate in the monitoring namespace that is valid for 90 days and renews 15 days before expiry.

      2. Create the Certificate resource:

        kubectl apply -f ~/lke-monitor/certificate-prod.yaml
        
      3. Verify that the Certificate has been successfully issued:

        kubectl -n monitoring get certs
        

        When your certificate is ready, you should see a similar output:

          
        NAME                       READY   SECRET                    AGE
        prometheus-operator-prod   True    letsencrypt-secret-prod   33s
            
        

      Next, you will create the necessary resources for basic authentication of the Prometheus and Alertmanager interfaces.

      Configure Basic Auth Credentials

      In this section, you will use htpasswd to generate credentials for basic authentication and create a Kubernetes Secret, which will then be applied to your Ingress configuration to secure access to your monitoring interfaces.

      1. Create a basic authentication password file for the user admin:

        htpasswd -c ~/lke-monitor/auth admin
        

        Follow the prompts to create a secure password, then store your password securely for future reference.
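If installing apache2-utils is not an option, an equivalent credential file can be generated with openssl (a sketch assuming openssl is available; the password shown is a placeholder to replace with your own):

```shell
# Create an htpasswd-style entry for user "admin" using openssl's Apache MD5
# (apr1) scheme, which NGINX basic auth accepts.
mkdir -p ~/lke-monitor
PASS="examplepassword"   # placeholder: substitute a secure password
printf 'admin:%s\n' "$(openssl passwd -apr1 "$PASS")" > ~/lke-monitor/auth
```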

      2. Create a Kubernetes Secret for the monitoring namespace using the file you created above:

        kubectl -n monitoring create secret generic basic-auth --from-file=auth
        
      3. Verify that the basic-auth secret has been created on your LKE cluster:

        kubectl -n monitoring get secret basic-auth
        

        You should see a similar output to the following:

          
        NAME         TYPE     DATA   AGE
        basic-auth   Opaque   1      81s
            
        

      All the necessary components are now in place to enable HTTPS and basic authentication on your monitoring interfaces. In the next section, you will complete the steps needed to deploy the Prometheus Operator.

      Deploy or Upgrade Prometheus Operator

      In this section, you will create a Helm chart values file and use it to deploy Prometheus Operator to your LKE cluster.

      1. Using the text editor of your choice, create a file named values-https-basic-auth.yaml in the ~/lke-monitor directory and save it with the configurations below. Since the control plane is Linode-managed, as part of this step we will also disable metrics collection for the control plane components:

        Caution

        The below configuration will establish persistent data storage with three separate 10GiB Block Storage Volumes for Prometheus, Alertmanager, and Grafana. Because the Prometheus Operator deploys as StatefulSets, these Volumes and their associated Persistent Volume resources must be deleted manually if you later decide to tear down this Helm release.
        ~/lke-monitor/values-https-basic-auth.yaml
        
        # Helm chart values for Prometheus Operator with HTTPS and basic auth
        prometheus:
          ingress:
            enabled: true
            annotations:
              kubernetes.io/ingress.class: nginx
              nginx.ingress.kubernetes.io/rewrite-target: /$2
              cert-manager.io/cluster-issuer: letsencrypt-prod
              nginx.ingress.kubernetes.io/auth-type: basic
              nginx.ingress.kubernetes.io/auth-secret: basic-auth
              nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
            hosts:
            - example.com
            paths:
            - /prometheus(/|$)(.*)
            tls:
            - secretName: lke-monitor-tls
              hosts:
              - example.com
          prometheusSpec:
            routePrefix: /
            externalUrl: https://example.com/prometheus
            storageSpec:
              volumeClaimTemplate:
                spec:
                  storageClassName: linode-block-storage-retain
                  resources:
                    requests:
                      storage: 10Gi
        
        alertmanager:
          ingress:
            enabled: true
            annotations:
              kubernetes.io/ingress.class: nginx
              nginx.ingress.kubernetes.io/rewrite-target: /$2
              cert-manager.io/cluster-issuer: letsencrypt-prod
              nginx.ingress.kubernetes.io/auth-type: basic
              nginx.ingress.kubernetes.io/auth-secret: basic-auth
              nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
            hosts:
            - example.com
            paths:
            - /alertmanager(/|$)(.*)
            tls:
            - secretName: lke-monitor-tls
              hosts:
              - example.com
          alertmanagerSpec:
            routePrefix: /
            externalUrl: https://example.com/alertmanager
            storage:
              volumeClaimTemplate:
                spec:
                  storageClassName: linode-block-storage-retain
                  resources:
                    requests:
                      storage: 10Gi
        
        grafana:
          persistence:
            enabled: true
            storageClassName: linode-block-storage-retain
            size: 10Gi
          ingress:
            enabled: true
            annotations:
              kubernetes.io/ingress.class: nginx
              nginx.ingress.kubernetes.io/rewrite-target: /$2
              nginx.ingress.kubernetes.io/auth-type: basic
              nginx.ingress.kubernetes.io/auth-secret: basic-auth
              nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
            hosts:
            - example.com
            path: /grafana(/|$)(.*)
            tls:
            - secretName: lke-monitor-tls
              hosts:
              - example.com
          grafana.ini:
            server:
              domain: example.com
              root_url: "%(protocol)s://%(domain)s/grafana/"
              enable_gzip: "true"
        
        # Disable control plane metrics
        kubeEtcd:
          enabled: false
        
        kubeControllerManager:
          enabled: false
        
        kubeScheduler:
          enabled: false
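The path patterns in the values above pair with the rewrite-target annotation: in /prometheus(/|$)(.*) with rewrite-target: /$2, capture group 2 is everything after the path prefix, so the prefix is stripped before the request is proxied upstream. This local sed sketch (illustrative only, not part of the deployment) mimics that rewrite:

```shell
# Emulate NGINX's rewrite: group 2 captures everything after "/prometheus".
echo "/prometheus/graph" | sed -E 's#^/prometheus(/|$)(.*)#/\2#'
# Prints "/graph": the path Prometheus itself receives.
```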
            
      2. Export an environment variable to store your Grafana admin password:

        Note

        Replace prom-operator in the below command with a secure password and save the password for later reference.

        export GRAFANA_ADMINPASSWORD="prom-operator"
        
      3. Using Helm, deploy a Prometheus Operator release labeled lke-monitor in the monitoring namespace on your LKE cluster with the settings established in your values-https-basic-auth.yaml file:

        Note

        If you have already deployed a Prometheus Operator release, you can upgrade it by replacing helm install with helm upgrade in the below command.

        helm install \
        lke-monitor stable/prometheus-operator \
        -f ~/lke-monitor/values-https-basic-auth.yaml \
        --namespace monitoring \
        --set grafana.adminPassword=$GRAFANA_ADMINPASSWORD
        

        Once completed, you will see output similar to the following:

          
        NAME: lke-monitor
        LAST DEPLOYED: Mon Jul 27 17:03:46 2020
        NAMESPACE: monitoring
        STATUS: deployed
        REVISION: 1
        NOTES:
        The Prometheus Operator has been installed. Check its status by running:
          kubectl --namespace monitoring get pods -l "release=lke-monitor"
        
        Visit https://github.com/coreos/prometheus-operator for instructions on how
        to create & configure Alertmanager and Prometheus instances using the Operator.
        
        
      4. Verify that the Prometheus Operator has been deployed to your LKE cluster and its components are running and ready by checking the pods in the monitoring namespace:

        kubectl -n monitoring get pods
        

        You should see a similar output to the following, confirming that you are ready to access your monitoring interfaces using your domain:

          
        NAME                                                     READY   STATUS    RESTARTS   AGE
        alertmanager-lke-monitor-prometheus-ope-alertmanager-0   2/2     Running   0          45s
        lke-monitor-grafana-84cbb54f98-7gqtk                     2/2     Running   0          54s
        lke-monitor-kube-state-metrics-68c56d976f-n587d          1/1     Running   0          54s
        lke-monitor-prometheus-node-exporter-6xt8m               1/1     Running   0          53s
        lke-monitor-prometheus-node-exporter-dkc27               1/1     Running   0          53s
        lke-monitor-prometheus-node-exporter-pkc65               1/1     Running   0          53s
        lke-monitor-prometheus-ope-operator-f87bc9f7c-w56sw      2/2     Running   0          54s
        prometheus-lke-monitor-prometheus-ope-prometheus-0       3/3     Running   1          35s
            
        

      Access Monitoring Interfaces from your Domain

      Your monitoring interfaces are now publicly accessible with HTTPS and basic auth from the domain you have configured for use with this guide at the following paths:

      Resource       Domain and path
      Prometheus     example.com/prometheus
      Alertmanager   example.com/alertmanager
      Grafana        example.com/grafana

      When accessing an interface for the first time, log in as admin with the password you configured for basic auth credentials.

      When accessing the Grafana interface, you will then log in again as admin with the password you exported as $GRAFANA_ADMINPASSWORD on your local environment. The Grafana dashboards are accessible at Dashboards > Manage from the left navigation bar.


      This guide is published under a CC BY-ND 4.0 license.




      How to Deploy Prometheus with One-Click Apps


      Updated by Linode. Contributed by Linode.

      Use Prometheus to collect metrics and receive alerts with this open-source monitoring tool. Prometheus monitors targets that you define at given intervals by scraping their metrics HTTP endpoints. This tool is particularly well-suited for numeric time series data, which makes it ideal for machine-centric monitoring as well as monitoring of highly dynamic service-oriented architectures.
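The scrape model described above is configured in Prometheus’ prometheus.yml file; a minimal illustrative sketch (the target address is a hypothetical Node Exporter endpoint) looks like:

```yaml
# Minimal Prometheus scrape configuration (illustrative).
global:
  scrape_interval: 15s        # how often each target is scraped by default
scrape_configs:
  - job_name: node            # job label applied to the scraped series
    static_configs:
      - targets: ['localhost:9100']   # a Node Exporter metrics HTTP endpoint
```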

      Deploy Prometheus with One-Click Apps

      One-Click Apps allow you to easily deploy software on a Linode using the Linode Cloud Manager. To access Linode’s One-Click Apps:

      1. Log in to your Linode Cloud Manager account.

      2. From the Linode dashboard, click on the Create button in the top right-hand side of the screen and select Linode from the dropdown menu.

      3. The Linode creation page will appear. Select the One-Click tab.

      4. Under the Select App section, select the app you would like to deploy:

        Select a One-Click App to deploy

      5. Once you have selected the app, proceed to the app’s Options section and provide values for the required fields.

      Linode Options

      Provide configurations for your Linode server. The table below includes details about each configuration option.

      Configuration | Description
      Select an Image | Debian 9 is currently the only image supported by the Prometheus One-Click App, and it is pre-selected on the Linode creation page. Required.
      Region | The region where you would like your Linode to reside. In general, it's best to choose a location that's closest to you. For more information on choosing a data center, review the How to Choose a Data Center guide. You can also generate MTR reports for a deeper look at the network routes between you and each of our data centers. Required.
      Linode Plan | Your Linode's hardware resources. Prometheus' default settings require 3 GB of memory, so we recommend starting with at least a 4 GB Linode plan. You can always resize your Linode to a different plan later if you need to increase or decrease your system resources. Required.
      Linode Label | The name for your Linode, which must be unique among all of the Linodes on your account. This name is how you identify your server in the Cloud Manager's Dashboard. Required.
      Add Tags | A tag to help organize and group your Linode resources. Tags can be applied to Linodes, Block Storage Volumes, NodeBalancers, and Domains.
      Root Password | The primary administrative password for your Linode instance. This password must be provided when you log in to your Linode via SSH. It must be at least 6 characters long and contain characters from two of the following categories: lowercase letters, uppercase letters, numbers, and punctuation characters. Your root password can be used to perform any action on your server, so make it long, complex, and unique. Required.

      When you’ve provided all required Linode Options, click on the Create button. Your Prometheus app will finish installing within 2 to 5 minutes after your Linode has finished provisioning.

      Getting Started after Deployment

      Access Your Prometheus Instance

      Now that your Prometheus One-Click App is deployed, you can log into Prometheus to access its expression browser, alerts, status, and more.

      1. Open a browser and navigate to http://192.0.2.0:9090/. Replace 192.0.2.0 with your Linode’s IP address. This will bring you to your Prometheus instance’s expression browser.

      2. Verify that Prometheus is serving metrics by navigating to http://192.0.2.0:9090/metrics. Replace 192.0.2.0 with your Linode’s IP address. You should see a page of metrics similar to the example below.

        Verify that Prometheus is serving metrics by visiting the sample metrics page.

      3. Grafana, the open source analytics and metric visualization tool, supports querying Prometheus. Consider deploying a Grafana instance with One-Click Apps to create visualizations for your Prometheus metrics.

      Prometheus Default Settings

      • Prometheus’ main configuration is located in the /etc/prometheus/prometheus.yml file.
      • This file includes a scrape configuration for Prometheus itself.
      • The scraping interval and evaluation interval are configured globally to be 15s. The scrape_interval parameter defines the time between each Prometheus scrape, while the evaluation_interval parameter is the time between each evaluation of Prometheus’ alerting rules.
      • The Prometheus Node Exporter is added and enabled. This third-party system exporter is used to collect hardware and OS metrics. Your Node Exporter metrics are sent to port 9100 of your Linode.
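      Taken together, these defaults correspond to a prometheus.yml along the following lines. This is a simplified sketch for orientation, not the complete file shipped with the One-Click App:

```yaml
global:
  scrape_interval: 15s      # how often Prometheus scrapes each target
  evaluation_interval: 15s  # how often alerting rules are evaluated

scrape_configs:
  # Prometheus scrapes its own metrics endpoint.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Node Exporter serves hardware and OS metrics on port 9100.
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
```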

      More Information

      You may wish to consult the following resources for additional information on this topic. While these are provided in the hope that they will be useful, please note that we cannot vouch for the accuracy or timeliness of externally hosted materials.





      How to Set Up a Prometheus, Grafana and Alertmanager Monitoring Stack on DigitalOcean Kubernetes


      Introduction

      Along with tracing and logging, monitoring and alerting are essential components of a Kubernetes observability stack. Setting up monitoring for your DigitalOcean Kubernetes cluster allows you to track your resource usage and analyze and debug application errors.

      A monitoring system usually consists of a time-series database that houses metric data and a visualization layer. In addition, an alerting layer creates and manages alerts, handing them off to integrations and external services as necessary. Finally, one or more components generate or expose the metric data that will be stored, visualized, and processed for alerts by the stack.

      One popular monitoring solution is the open-source Prometheus, Grafana, and Alertmanager stack, deployed alongside kube-state-metrics and node_exporter to expose cluster-level Kubernetes object metrics as well as machine-level metrics like CPU and memory usage.

      Rolling out this monitoring stack on a Kubernetes cluster requires configuring individual components, manifests, Prometheus metrics, and Grafana dashboards, which can take some time. The DigitalOcean Kubernetes Cluster Monitoring Quickstart, released by the DigitalOcean Community Developer Education team, contains fully defined manifests for a Prometheus-Grafana-Alertmanager cluster monitoring stack, as well as a set of preconfigured alerts and Grafana dashboards. It can help you get up and running quickly, and forms a solid foundation from which to build your observability stack.

      In this tutorial, we’ll deploy this preconfigured stack on DigitalOcean Kubernetes, access the Prometheus, Grafana, and Alertmanager interfaces, and describe how to customize it.

      Prerequisites

      Before you begin, you’ll need a DigitalOcean Kubernetes cluster available to you, and the following tools installed in your local development environment:

      • The kubectl command-line interface installed on your local machine and configured to connect to your cluster. You can read more about installing and configuring kubectl in its official documentation.
      • The git version control system installed on your local machine. To learn how to install git on Ubuntu 18.04, consult How To Install Git on Ubuntu 18.04.
      • The Coreutils base64 tool installed on your local machine. If you’re using a Linux machine, this will most likely already be installed. If you’re using macOS, you can use openssl base64, which comes installed by default.

      Note: The Cluster Monitoring Quickstart has only been tested on DigitalOcean Kubernetes clusters. To use the Quickstart with other Kubernetes clusters, some modification to the manifest files may be necessary.

      Step 1 — Cloning the GitHub Repository and Configuring Environment Variables

      To start, clone the DigitalOcean Kubernetes Cluster Monitoring GitHub repository onto your local machine using git:

      Then, navigate into the repo:

      You should see the following directory structure:

      Output

      LICENSE README.md changes.txt manifest

      The manifest directory contains Kubernetes manifests for all of the monitoring stack components, including Service Accounts, Deployments, StatefulSets, ConfigMaps, etc. To learn more about these manifest files and how to configure them, skip ahead to Configuring the Monitoring Stack.

      If you just want to get things up and running, begin by setting the APP_INSTANCE_NAME and NAMESPACE environment variables, which will be used to configure a unique name for the stack's components and configure the Namespace into which the stack will be deployed:

      • export APP_INSTANCE_NAME=sammy-cluster-monitoring
      • export NAMESPACE=default

      In this tutorial, we set APP_INSTANCE_NAME to sammy-cluster-monitoring, which will prepend all of the monitoring stack Kubernetes object names. You should substitute in a unique descriptive prefix for your monitoring stack. We also set the Namespace to default. If you’d like to deploy the monitoring stack to a Namespace other than default, ensure that you first create it in your cluster:

      • kubectl create namespace "$NAMESPACE"

      You should see the following output:

      Output

      namespace/sammy created

      In this case, the NAMESPACE environment variable was set to sammy. Throughout the rest of the tutorial we'll assume that NAMESPACE has been set to default.

      Now, use the base64 command to base64-encode a secure Grafana password. Be sure to substitute a password of your choosing for your_grafana_password:

      • export GRAFANA_GENERATED_PASSWORD="$(echo -n 'your_grafana_password' | base64)"

      If you're using macOS, you can substitute the openssl base64 command which comes installed by default.
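      As a sanity check, decoding the stored value should print the original password back. Note that the -n flag on echo matters: without it, a trailing newline would be encoded into the secret. A sketch using the placeholder password from above:

```shell
# Encode the placeholder password (substitute your own value).
GRAFANA_GENERATED_PASSWORD="$(echo -n 'your_grafana_password' | base64)"

# Decoding should print the original password back.
echo -n "$GRAFANA_GENERATED_PASSWORD" | base64 --decode
```

      On macOS, older base64 builds use -D rather than --decode; openssl base64 -d works there as well.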

      At this point, you've grabbed the stack's Kubernetes manifests and configured the required environment variables, so you're now ready to substitute the configured variables into the Kubernetes manifest files and create the stack in your Kubernetes cluster.

      Step 2 — Creating the Monitoring Stack

      The DigitalOcean Kubernetes Monitoring Quickstart repo contains manifests for the following monitoring, scraping, and visualization components:

      • Prometheus is a time series database and monitoring tool that works by polling metrics endpoints and scraping and processing the data exposed by these endpoints. It allows you to query this data using PromQL, a time series data query language. Prometheus will be deployed into the cluster as a StatefulSet with 2 replicas that uses Persistent Volumes with DigitalOcean Block Storage. In addition, a preconfigured set of Prometheus Alerts, Rules, and Jobs will be stored as a ConfigMap. To learn more about these, skip ahead to the Prometheus section of Configuring the Monitoring Stack.
      • Alertmanager, usually deployed alongside Prometheus, forms the alerting layer of the stack, handling alerts generated by Prometheus and deduplicating, grouping, and routing them to integrations like email or PagerDuty. Alertmanager will be installed as a StatefulSet with 2 replicas. To learn more about Alertmanager, consult Alerting from the Prometheus docs.
      • Grafana is a data visualization and analytics tool that allows you to build dashboards and graphs for your metrics data. Grafana will be installed as a StatefulSet with one replica. In addition, a preconfigured set of Dashboards generated by kubernetes-mixin will be stored as a ConfigMap.
      • kube-state-metrics is an add-on agent that listens to the Kubernetes API server and generates metrics about the state of Kubernetes objects like Deployments and Pods. These metrics are served as plaintext on HTTP endpoints and consumed by Prometheus. kube-state-metrics will be installed as an auto-scalable Deployment with one replica.
      • node-exporter, a Prometheus exporter that runs on cluster nodes and provides OS and hardware metrics like CPU and memory usage to Prometheus. These metrics are also served as plaintext on HTTP endpoints and consumed by Prometheus. node-exporter will be installed as a DaemonSet.

      By default, along with scraping metrics generated by node-exporter, kube-state-metrics, and the other components listed above, Prometheus will be configured to scrape metrics from the following components:

      • kube-apiserver, the Kubernetes API server.
      • kubelet, the primary node agent that interacts with kube-apiserver to manage Pods and containers on a node.
      • cAdvisor, a node agent that discovers running containers and collects their CPU, memory, filesystem, and network usage metrics.

      To learn more about configuring these components and Prometheus scraping jobs, skip ahead to Configuring the Monitoring Stack. We'll now substitute the environment variables defined in the previous step into the repo's manifest files, and concatenate the individual manifests into a single master file.

      Begin by using awk and envsubst to fill in the APP_INSTANCE_NAME, NAMESPACE, and GRAFANA_GENERATED_PASSWORD variables in the repo's manifest files. After substituting in the variable values, the files will be combined and saved into a master manifest file called sammy-cluster-monitoring_manifest.yaml.

      • awk 'FNR==1 {print "---"}{print}' manifest/* \
          | envsubst '$APP_INSTANCE_NAME $NAMESPACE $GRAFANA_GENERATED_PASSWORD' \
          > "${APP_INSTANCE_NAME}_manifest.yaml"
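      If the awk portion of the pipeline looks opaque: FNR==1 is true on the first line of each input file, so the command prints a --- YAML document separator before each manifest. A toy run on two throwaway files (the file names here are made up for illustration):

```shell
# Create two tiny stand-in "manifests".
mkdir -p /tmp/demo-manifest
printf 'kind: Service\n'     > /tmp/demo-manifest/a.yaml
printf 'kind: StatefulSet\n' > /tmp/demo-manifest/b.yaml

# Concatenate them with a "---" separator before each file.
awk 'FNR==1 {print "---"}{print}' /tmp/demo-manifest/*
```

      This prints a --- line followed by each file's contents, in filename order, which is exactly the multi-document YAML format kubectl expects.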

      You should consider storing this file in version control so that you can track changes to the monitoring stack and roll back to previous versions. If you do this, be sure to scrub the admin-password variable from the file so that you don't check your Grafana password into version control.
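      One way to do that scrubbing is a quick sed pass. This is a hypothetical sketch: it assumes the generated manifest stores the password in an admin-password field, so adjust the pattern to match your file before relying on it:

```shell
# Replace the base64 password value with a placeholder before committing.
# The .bak suffix keeps an unredacted backup alongside the file.
sed -i.bak 's/^\( *admin-password:\).*/\1 REDACTED/' "${APP_INSTANCE_NAME}_manifest.yaml"
```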

      Now that you've generated the master manifest file, use kubectl apply -f to apply the manifest and create the stack in the Namespace you configured:

      • kubectl apply -f "${APP_INSTANCE_NAME}_manifest.yaml" --namespace "${NAMESPACE}"

      You should see output similar to the following:

      Output

      serviceaccount/alertmanager created
      configmap/sammy-cluster-monitoring-alertmanager-config created
      service/sammy-cluster-monitoring-alertmanager-operated created
      service/sammy-cluster-monitoring-alertmanager created
      . . .
      clusterrolebinding.rbac.authorization.k8s.io/prometheus created
      configmap/sammy-cluster-monitoring-prometheus-config created
      service/sammy-cluster-monitoring-prometheus created
      statefulset.apps/sammy-cluster-monitoring-prometheus created

      You can track the stack’s deployment progress using kubectl get all. Once all of the stack components have a Running status, you can access the preconfigured Grafana dashboards through the Grafana web interface.

      Step 3 — Accessing Grafana and Exploring Metrics Data

      The Grafana Service manifest exposes Grafana as a ClusterIP Service, which means that it's only accessible via a cluster-internal IP address. To access Grafana outside of your Kubernetes cluster, you can either use kubectl patch to update the Service in-place to a public-facing type like NodePort or LoadBalancer, or kubectl port-forward to forward a local port to a Grafana Pod port. In this tutorial we'll forward ports, so you can skip ahead to Forwarding a Local Port to Access the Grafana Service. The following section on exposing Grafana externally is included for reference purposes.

      Exposing the Grafana Service using a Load Balancer (optional)

      If you'd like to create a DigitalOcean Load Balancer for Grafana with an external public IP, use kubectl patch to update the existing Grafana Service in-place to the LoadBalancer Service type:

      • kubectl patch svc "$APP_INSTANCE_NAME-grafana" \
          --namespace "$NAMESPACE" \
          -p '{"spec": {"type": "LoadBalancer"}}'

      The kubectl patch command allows you to update Kubernetes objects in-place to make changes without having to re-deploy the objects. You can also modify the master manifest file directly, adding a type: LoadBalancer parameter to the Grafana Service spec. To learn more about kubectl patch and Kubernetes Service types, you can consult the Update API Objects in Place Using kubectl patch and Services resources in the official Kubernetes docs.
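      If you take the manifest-editing route instead, the Grafana Service spec would look roughly like this. The ports and selector below are illustrative assumptions, not copied from the Quickstart repo, so match them against your manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sammy-cluster-monitoring-grafana
spec:
  type: LoadBalancer   # changed from the default ClusterIP
  ports:
    - port: 3000        # illustrative; match the port in your manifest
      targetPort: 3000
  selector:
    k8s-app: grafana    # illustrative; match the labels in your manifest
```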

      After running the above command, you should see the following:

      Output

      service/sammy-cluster-monitoring-grafana patched

      It may take several minutes to create the Load Balancer and assign it a public IP. You can track its progress using the following command with the -w flag to watch for changes:

      • kubectl get service "$APP_INSTANCE_NAME-grafana" -w

      Once the DigitalOcean Load Balancer has been created and assigned an external IP address, you can fetch its external IP using the following commands:

      • SERVICE_IP=$(kubectl get svc $APP_INSTANCE_NAME-grafana \
          --namespace $NAMESPACE \
          --output jsonpath='{.status.loadBalancer.ingress[0].ip}')
      • echo "http://${SERVICE_IP}/"

      You can now access the Grafana UI by navigating to http://SERVICE_IP/.

      Forwarding a Local Port to Access the Grafana Service

      If you don't want to expose the Grafana Service externally, you can also forward local port 3000 into the cluster directly to a Grafana Pod using kubectl port-forward.

      • kubectl port-forward --namespace ${NAMESPACE} ${APP_INSTANCE_NAME}-grafana-0 3000

      You should see the following output:

      Output

      Forwarding from 127.0.0.1:3000 -> 3000 Forwarding from [::1]:3000 -> 3000

      This will forward local port 3000 to containerPort 3000 of the Grafana Pod sammy-cluster-monitoring-grafana-0. To learn more about forwarding ports into a Kubernetes cluster, consult Use Port Forwarding to Access Applications in a Cluster.

      Visit http://localhost:3000 in your web browser. You should see the following Grafana login page:

      Grafana Login Page

      To log in, use the default username admin (if you haven't modified the admin-user parameter), and the password you configured in Step 1.

      You'll be brought to the following Home Dashboard:

      Grafana Home Page

      In the left-hand navigation bar, select the Dashboards button, then click on Manage:

      Grafana Dashboard Tab

      You'll be brought to the following dashboard management interface, which lists the dashboards configured in the dashboards-configmap.yaml manifest:

      Grafana Dashboard List

      These dashboards are generated by kubernetes-mixin, an open-source project that allows you to create a standardized set of cluster monitoring Grafana dashboards and Prometheus alerts. To learn more, consult the kubernetes-mixin GitHub repo.

      Click into the Kubernetes / Nodes dashboard, which visualizes CPU, memory, disk, and network usage for a given node:

      Grafana Nodes Dashboard

      Describing how to use these dashboards is outside of this tutorial’s scope, but you can consult the following resources to learn more:

      In the next step, we'll follow a similar process to connect to and explore the Prometheus monitoring system.

      Step 4 — Accessing Prometheus and Alertmanager

      To connect to the Prometheus Pods, we can use kubectl port-forward to forward a local port. If you’re done exploring Grafana, you can close the port-forward tunnel by hitting CTRL-C. Alternatively, you can open a new shell and create a new port-forward connection.

      Begin by listing running Pods in the default namespace:

      • kubectl get pod -n default

      You should see the following Pods:

      Output

      sammy-cluster-monitoring-alertmanager-0                      1/1   Running   0   17m
      sammy-cluster-monitoring-alertmanager-1                      1/1   Running   0   15m
      sammy-cluster-monitoring-grafana-0                           1/1   Running   0   16m
      sammy-cluster-monitoring-kube-state-metrics-d68bb884-gmgxt   2/2   Running   0   16m
      sammy-cluster-monitoring-node-exporter-7hvb7                 1/1   Running   0   16m
      sammy-cluster-monitoring-node-exporter-c2rvj                 1/1   Running   0   16m
      sammy-cluster-monitoring-node-exporter-w8j74                 1/1   Running   0   16m
      sammy-cluster-monitoring-prometheus-0                        1/1   Running   0   16m
      sammy-cluster-monitoring-prometheus-1                        1/1   Running   0   16m

      We are going to forward local port 9090 to port 9090 of the sammy-cluster-monitoring-prometheus-0 Pod:

      • kubectl port-forward --namespace ${NAMESPACE} sammy-cluster-monitoring-prometheus-0 9090

      You should see the following output:

      Output

      Forwarding from 127.0.0.1:9090 -> 9090 Forwarding from [::1]:9090 -> 9090

      This indicates that local port 9090 is being forwarded successfully to the Prometheus Pod.

      Visit http://localhost:9090 in your web browser. You should see the following Prometheus Graph page:

      Prometheus Graph Page

      From here you can use PromQL, the Prometheus query language, to select and aggregate time series metrics stored in its database. To learn more about PromQL, consult Querying Prometheus from the official Prometheus docs.
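      For example, with the exporters deployed by this stack you might try queries like the following (the metric names assume default node-exporter and kube-state-metrics configurations):

```promql
# Per-node CPU utilization over the last 5 minutes (node-exporter):
1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

# Pods in the Ready condition, grouped by namespace (kube-state-metrics):
sum by (namespace) (kube_pod_status_ready{condition="true"})
```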

      In the Expression field, type kubelet_node_name and hit Execute. You should see a list of time series with the metric kubelet_node_name that reports the Nodes in your Kubernetes cluster. You can see which node generated the metric and which job scraped the metric in the metric labels:

      Prometheus Query Results

      Finally, in the top navigation bar, click on Status and then Targets to see the list of targets Prometheus has been configured to scrape. You should see a list of targets corresponding to the list of monitoring endpoints described at the beginning of Step 2.

      To learn more about Prometheus and how to query your cluster metrics, consult the official Prometheus docs.

      To connect to Alertmanager, which manages Alerts generated by Prometheus, we'll follow a similar process to the one we used to connect to Prometheus. In general, you can explore Alertmanager Alerts by clicking into Alerts in the Prometheus top navigation bar.

      To connect to the Alertmanager Pods, we will once again use kubectl port-forward to forward a local port. If you’re done exploring Prometheus, you can close the port-forward tunnel by hitting CTRL-C, or open a new shell to create a new connection.

      We are going to forward local port 9093 to port 9093 of the sammy-cluster-monitoring-alertmanager-0 Pod:

      • kubectl port-forward --namespace ${NAMESPACE} sammy-cluster-monitoring-alertmanager-0 9093

      You should see the following output:

      Output

      Forwarding from 127.0.0.1:9093 -> 9093 Forwarding from [::1]:9093 -> 9093

      This indicates that local port 9093 is being forwarded successfully to an Alertmanager Pod.

      Visit http://localhost:9093 in your web browser. You should see the following Alertmanager Alerts page:

      Alertmanager Alerts Page

      From here, you can explore firing alerts and optionally silencing them. To learn more about Alertmanager, consult the official Alertmanager documentation.

      In the next step, you'll learn how to optionally configure and scale some of the monitoring stack components.

      Step 5 — Configuring the Monitoring Stack (optional)

      The manifests included in the DigitalOcean Kubernetes Cluster Monitoring Quickstart repository can be modified to use different container images, different numbers of Pod replicas, different ports, and customized configuration files.

      In this step, we'll provide a high-level overview of each manifest’s purpose, and then demonstrate how to scale Prometheus up to 3 replicas by modifying the master manifest file.

      To begin, list the contents of the manifest subdirectory in the repo:

      • ls manifest

      Output

      alertmanager-0serviceaccount.yaml
      alertmanager-configmap.yaml
      alertmanager-operated-service.yaml
      alertmanager-service.yaml
      . . .
      node-exporter-ds.yaml
      prometheus-0serviceaccount.yaml
      prometheus-configmap.yaml
      prometheus-service.yaml
      prometheus-statefulset.yaml

      Here you'll find manifests for the different monitoring stack components. To learn more about specific parameters in the manifests, click into the links and consult the comments included throughout the YAML files:

      Alertmanager

      Grafana

      kube-state-metrics

      node-exporter

      Prometheus

      • prometheus-0serviceaccount.yaml: The Prometheus Service Account, ClusterRole and ClusterRoleBinding.
      • prometheus-configmap.yaml: A ConfigMap that contains three configuration files:

        • alerts.yaml: Contains a preconfigured set of alerts generated by kubernetes-mixin (which was also used to generate the Grafana dashboards). To learn more about configuring alerting rules, consult Alerting Rules from the Prometheus docs.
        • prometheus.yaml: Prometheus's main configuration file. Prometheus has been preconfigured to scrape all the components listed at the beginning of Step 2. Configuring Prometheus goes beyond the scope of this article, but to learn more, you can consult Configuration from the official Prometheus docs.
        • rules.yaml: A set of Prometheus recording rules that enable Prometheus to compute frequently needed or computationally expensive expressions, and save their results as a new set of time series. These are also generated by kubernetes-mixin, and configuring them goes beyond the scope of this article. To learn more, you can consult Recording Rules from the official Prometheus documentation.
      • prometheus-service.yaml: The Service that exposes the Prometheus StatefulSet.

      • prometheus-statefulset.yaml: The Prometheus StatefulSet, configured with 2 replicas. This parameter can be scaled depending on your needs.
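      To give a flavor of what rules.yaml contains, a recording rule has the following shape. This particular rule is a hypothetical sketch in the style of kubernetes-mixin, not copied from the repo:

```yaml
groups:
  - name: node.rules
    rules:
      # Precompute 1-minute CPU utilization per instance so dashboards can
      # query the cheap recorded series instead of the raw expression.
      - record: instance:node_cpu_utilisation:rate1m
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m]))
```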

      Example: Scaling Prometheus

      To demonstrate how to modify the monitoring stack, we'll scale the number of Prometheus replicas from 2 to 3.

      Open the sammy-cluster-monitoring_manifest.yaml master manifest file using your editor of choice:

      • nano sammy-cluster-monitoring_manifest.yaml

      Scroll down to the Prometheus StatefulSet section of the manifest:

      Output

      . . .
      apiVersion: apps/v1beta2
      kind: StatefulSet
      metadata:
        name: sammy-cluster-monitoring-prometheus
        labels: &Labels
          k8s-app: prometheus
          app.kubernetes.io/name: sammy-cluster-monitoring
          app.kubernetes.io/component: prometheus
      spec:
        serviceName: "sammy-cluster-monitoring-prometheus"
        replicas: 2
        podManagementPolicy: "Parallel"
        updateStrategy:
          type: "RollingUpdate"
        selector:
          matchLabels: *Labels
        template:
          metadata:
            labels: *Labels
          spec:
      . . .

      Change the number of replicas from 2 to 3:

      Output

      . . .
      apiVersion: apps/v1beta2
      kind: StatefulSet
      metadata:
        name: sammy-cluster-monitoring-prometheus
        labels: &Labels
          k8s-app: prometheus
          app.kubernetes.io/name: sammy-cluster-monitoring
          app.kubernetes.io/component: prometheus
      spec:
        serviceName: "sammy-cluster-monitoring-prometheus"
        replicas: 3
        podManagementPolicy: "Parallel"
        updateStrategy:
          type: "RollingUpdate"
        selector:
          matchLabels: *Labels
        template:
          metadata:
            labels: *Labels
          spec:
      . . .

      When you're done, save and close the file.

      Apply the changes using kubectl apply -f:

      • kubectl apply -f sammy-cluster-monitoring_manifest.yaml --namespace default

      You can track progress using kubectl get pods. Using this same technique, you can update many of the Kubernetes parameters and much of the configuration for this observability stack.

      Conclusion

      In this tutorial, you installed a Prometheus, Grafana, and Alertmanager monitoring stack into your DigitalOcean Kubernetes cluster with a standard set of dashboards, Prometheus rules, and alerts.

      You may also choose to deploy this monitoring stack using the Helm Kubernetes package manager. To learn more, consult How to Set Up DigitalOcean Kubernetes Cluster Monitoring with Helm and Prometheus. One additional way to get this stack up and running is to use the DigitalOcean Marketplace Kubernetes Monitoring Stack solution, currently in beta.

      The DigitalOcean Kubernetes Cluster Monitoring Quickstart repository is heavily based on and modified from Google Cloud Platform’s click-to-deploy Prometheus solution. A full manifest of modifications and changes from the original repository can be found in the Quickstart repo’s changes.txt file.


