Hands-on with G123 scalable ELK stack deployment on Kubernetes (AWS EKS)
Bo | Posted on 06-15
Troubleshooting and fixing issues by analyzing log data is a major topic in our multi-cluster (cloud) system design. Unlike databases or log files in a single-environment setup, each service/application in such a multi-cluster design runs in its own environment, which means they do not share resources in the same way. As a result, it is always painful for the SRE team to get notified about, or locate the root cause of, issues involving cross-service dependencies. In G123, we have hundreds of online services (including backend services and batch processes) running on multiple cloud providers (AWS and Alicloud). A centralized logging approach is therefore essential for log aggregation, processing, storage, and analysis across our engineering and SRE teams.
At G123, we chose ELK as our centralized logging solution. The ELK stack (which stands for Elasticsearch, Logstash, and Kibana) is one of the most popular open-source platforms for log collection, storage, and analytics, and has been embraced by companies such as Netflix, LinkedIn, and Twitter.
Elasticsearch: Open-source, full-text search and analysis engine, based on the Apache Lucene search engine.
Logstash: Server‑side data processing pipeline that ingests data from multiple sources simultaneously, transforms and sends to a “stash” like Elasticsearch.
Kibana: Visualizes Elasticsearch data and navigates the Elastic Stack. Do anything from tracking query load to understanding the way requests flow through applications.
ELK deployment on container-based platforms is constantly evolving. Recently, Elastic introduced a Kubernetes operator based deployment solution for the ELK stack (ECK). In this blog, we will walk through our architecture and illustrate how we deploy a fully scalable ELK stack on the G123 Kubernetes cluster using this operator. We will also go over how we gather logs across different G123 services, including the Kubernetes system itself and AWS services.
Log Sources
In our target list, we have three categories of logs:
Kubernetes system log: generated by the Kubernetes cluster itself, e.g., from Kubernetes system pods (kube-proxy, coredns, etc.)
Application log: generated by the application or services deployed by developers.
AWS service log: generated by each AWS service we are using.
Architecture
For the Kubernetes system log and application log, we use Filebeat (from the Beats family released by elastic.co) to collect them from Kubernetes nodes and send them to Logstash, where we further process and enrich them before shipping them to the Elasticsearch cluster and eventually visualizing them in the Kibana UI.
When choosing the node-level collection agent, we compared Filebeat and Logstash:
Management: centralized management is provided by Logstash.
Performance: Filebeat consumes only minimal memory, while Logstash consumes considerably more memory and storage.
Transport and traffic management: Filebeat has built-in delivery reliability, while Logstash is typically deployed with Redis for enhanced reliability.
Based on performance and reliability, we chose Filebeat over Logstash to gather Kubernetes system logs from each cluster node and ship them to Logstash for further processing, before centralizing everything in the Elasticsearch cluster. On the other hand, we use Logstash to collect logs directly from AWS services. The figure below illustrates the overall architecture of our deployment, in which there are two types of log collection agents, Logstash and Filebeat, each responsible for collecting different sorts of log documents.
There are three types of nodes (pods) in the Elasticsearch cluster:
Data node (pod): stores data and executes data-related operations such as search and aggregations.
Master node (pod): in charge of cluster-wide management and configuration.
Ingest node (pod): pre-processes documents before indexing.
In this blog, we'll demonstrate an Elasticsearch cluster with 3 master nodes and 3 data nodes; in our deployment, each data node also plays the role of an ingest node.
Prerequisite
Let's take a look at the prerequisites before we begin the deployment.
Kubernetes cluster
All we need is one operational Kubernetes cluster. For our deployment, we use an AWS EKS cluster with 5 m5.xlarge spot nodes labeled as dedicated to the ELK workload.
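As a reference, the node labeling can be defined in the node group itself. The fragment below is a minimal sketch of an eksctl managed node group, assuming eksctl is used to provision the cluster; the cluster name, region, and sizes are illustrative. The only part the rest of this deployment relies on is the dedicated: elk label, which the Elasticsearch and Kibana nodeSelector keys on later.

# Illustrative eksctl ClusterConfig fragment: a spot node group dedicated to ELK
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: g123-cluster          # hypothetical cluster name
  region: ap-northeast-1      # hypothetical region
managedNodeGroups:
  - name: elk-nodes
    instanceTypes: ["m5.xlarge"]
    spot: true                # spot instances, as in our setup
    desiredCapacity: 5
    labels:
      dedicated: elk          # matched by the nodeSelector in later manifests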
Deployment
Deploy Kubernetes Operator
The first step is to download the Kubernetes operator manifest from the official repo and apply it to our cluster.
$ kubectl apply -f https://download.elastic.co/downloads/eck/1.1.0/all-in-one.yaml
customresourcedefinition.apiextensions.k8s.io/apmservers.apm.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/elasticsearches.elasticsearch.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/kibanas.kibana.k8s.elastic.co created
clusterrole.rbac.authorization.k8s.io/elastic-operator created
clusterrolebinding.rbac.authorization.k8s.io/elastic-operator created
namespace/elastic-system created
statefulset.apps/elastic-operator created
serviceaccount/elastic-operator created
validatingwebhookconfiguration.admissionregistration.k8s.io/elastic-webhook.k8s.elastic.co created
service/elastic-webhook-server created
secret/elastic-webhook-server-cert created
Deploy Elasticsearch Cluster
After that, we can create the Elasticsearch cluster, with 3 master nodes and 3 data nodes configured (each data node also takes the ingest role); the deployment manifest is saved as elasticsearch.yaml.
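The following is a minimal sketch of what such a manifest looks like under the ECK operator. The cluster name, Elasticsearch version, storage size, and resource figures are illustrative; the node topology, the xpack.ml.enabled setting, and the dedicated=elk nodeSelector correspond to the customizations described below.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster            # hypothetical cluster name
  namespace: elastic-system
spec:
  version: 7.13.0                  # illustrative version
  nodeSets:
  - name: master
    count: 3
    config:
      node.master: true
      node.data: false
      node.ingest: false
      xpack.ml.enabled: true       # requires SSE4.2-capable CPUs
    podTemplate:
      spec:
        nodeSelector:
          dedicated: elk           # matches --node-labels=dedicated=elk
        containers:
        - name: elasticsearch
          resources:
            requests:
              cpu: 1
              memory: 2Gi
            limits:
              memory: 2Gi
  - name: data
    count: 3
    config:
      node.master: false
      node.data: true
      node.ingest: true            # data nodes also act as ingest nodes
      xpack.ml.enabled: true
    podTemplate:
      spec:
        nodeSelector:
          dedicated: elk
        containers:
        - name: elasticsearch
          resources:
            requests:
              cpu: 2
              memory: 8Gi
            limits:
              memory: 8Gi
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi         # illustrative size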
In the above deployment file, we have a few customizations:
Setting xpack.ml.enabled: true lets us utilize the machine learning APIs. However, if the CPU does not support SSE4.2, this must be disabled. (SSE4.2 is supported on Intel Core i7 ("Nehalem"), Intel Atom (Silvermont core), AMD Bulldozer, AMD Jaguar, and later processors.)
The nodeSelector part follows our dedicated node labeling: --node-labels=dedicated=elk
Pod resource specifications are set according to expected usage.
Then apply the yaml file to the cluster:
$ kubectl apply -f elasticsearch.yaml
After this, we can perform a health check on the services and pods within the cluster.
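For example, the cluster-level health reported by the operator and the status of the Elasticsearch pods can be checked as follows (the resource name and the sample output line are illustrative, based on the sketch above):

# Cluster-level health as reported by the ECK operator
$ kubectl get elasticsearch -n elastic-system
NAME              HEALTH   NODES   VERSION   PHASE   AGE
elastic-cluster   green    6       7.13.0    Ready   30m

# Per-pod status of the master and data node sets
$ kubectl get pods -n elastic-system -l elasticsearch.k8s.elastic.co/cluster-name=elastic-cluster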
Here are the explanations of Elasticsearch cluster health statuses from the official documentation:
green: All shards are assigned.
yellow: All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired.
red: One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned.
Deploy Kibana
Once the Elasticsearch cluster is up and running, we can deploy the Kibana application for data visualization. Below is the deployment file (kibana.yaml).
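A minimal sketch of such a Kibana resource under ECK is shown here; the Elasticsearch reference name, version, and resource figures are assumptions, while the resource name is chosen to match the kibana-cluster-kb-http service seen later.

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana-cluster             # ECK derives the service name kibana-cluster-kb-http from this
  namespace: elastic-system
spec:
  version: 7.13.0                  # illustrative; should match the Elasticsearch version
  count: 1
  elasticsearchRef:
    name: elastic-cluster          # hypothetical Elasticsearch resource name
  podTemplate:
    spec:
      nodeSelector:
        dedicated: elk
      containers:
      - name: kibana
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            memory: 1Gi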
In our case, a horizontal pod autoscaler is also used to monitor the Kibana pod; it scales the deployment out to multiple instances when the resource consumption threshold (measured by memory usage) is reached.
The HorizontalPodAutoscaler.yaml file for our Kibana deployment:
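A sketch of such an autoscaler, assuming the ECK-generated Kibana Deployment is named kibana-cluster-kb and that 80% average memory utilization is the scaling threshold (the replica bounds and threshold are assumptions; a metrics server must be available in the cluster):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: kibana-hpa
  namespace: elastic-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kibana-cluster-kb        # Deployment created by ECK for the Kibana resource
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80     # illustrative threshold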
After applying the above two yaml files, the service will be running in the cluster.
$ kubectl get svc -n elastic-system | grep kibana
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kibana-cluster-kb-http   ClusterIP   172.20.201.58   <none>        5601/TCP   7h
Now we can check the Kibana service and access it from the deploy machine. First, forward the service to the deploy machine using port-forward.
$ kubectl port-forward service/kibana-cluster-kb-http 5601 -n elastic-system
Forwarding from 127.0.0.1:5601 -> 5601
Forwarding from [::1]:5601 -> 5601
Visit http://localhost:5601 in the local browser. The Kibana UI will show up after we enter the username elastic and the password generated during the Elasticsearch deployment process (Step 2).
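The password for the built-in elastic user is stored by the operator in a Kubernetes secret named <elasticsearch-name>-es-elastic-user. Assuming the hypothetical cluster name elastic-cluster from the sketch above, it can be read like this:

$ kubectl get secret elastic-cluster-es-elastic-user \
    -n elastic-system \
    -o go-template='{{.data.elastic | base64decode}}'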
Up to this point, we have a running Elasticsearch cluster and a Kibana application. In the next section, we'll deploy the log collection part: Logstash and Filebeat.
Deploy Logstash
The first step of the Logstash deployment is creating a ConfigMap in Kubernetes to store the configuration (logstash_cm.yaml).
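A minimal sketch of such a ConfigMap is shown below: a Beats input listening on port 5044 and an Elasticsearch output pointing at the ECK-created service. The service name, credential wiring, and index pattern are assumptions, and disabling certificate verification is only a shortcut for the sketch.

apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: elastic-system
data:
  logstash.conf: |
    input {
      beats {
        port => 5044                                         # Filebeat ships logs here
      }
    }
    output {
      elasticsearch {
        hosts => ["https://elastic-cluster-es-http:9200"]    # ECK-created service (assumed name)
        user => "elastic"
        password => "${ELASTIC_PASSWORD}"                    # injected from the ECK secret
        ssl => true
        ssl_certificate_verification => false                # sketch shortcut; mount the CA in production
        index => "filebeat-%{+YYYY.MM.dd}"
      }
    }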
Then we create the deployment file (logstash_deployment.yaml) for the Logstash service. The ConfigMap is mounted as a volume inside the Logstash pod.
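A sketch of the corresponding Deployment follows; the image version, replica count, and secret name are assumptions, while the mount path is Logstash's default pipeline directory.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash
  namespace: elastic-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      nodeSelector:
        dedicated: elk
      containers:
      - name: logstash
        image: docker.elastic.co/logstash/logstash:7.13.0
        ports:
        - containerPort: 5044                          # Beats input port
        env:
        - name: ELASTIC_PASSWORD                       # referenced in the pipeline above
          valueFrom:
            secretKeyRef:
              name: elastic-cluster-es-elastic-user    # ECK-generated secret (assumed name)
              key: elastic
        volumeMounts:
        - name: pipeline
          mountPath: /usr/share/logstash/pipeline      # default pipeline directory
      volumes:
      - name: pipeline
        configMap:
          name: logstash-config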
The last step is creating a Logstash service manifest (logstash_svc.yaml). This generates an external IP (an internal load balancer hostname) for cross-cluster service discovery, so that Filebeat running on other cluster nodes can find this Logstash.
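A sketch of such a Service is shown here. The aws-load-balancer-internal annotation asks AWS for a VPC-internal load balancer, which is consistent with the internal-*.amazonaws.com hostname used in the Filebeat configuration later; the exact annotations in our cluster may differ.

apiVersion: v1
kind: Service
metadata:
  name: logstash-service
  namespace: elastic-system
  annotations:
    # Request an internal (VPC-only) load balancer so the hostname
    # resolves from other nodes and clusters inside our network.
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: logstash
  ports:
  - name: beats
    port: 5044
    targetPort: 5044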
After applying all three yaml files above in Kubernetes, we can get the Logstash service information; the external IP address will be used in the Filebeat configuration in the next step.
$ kubectl get svc -n elastic-system | grep logstash-service
Below is the output from our deployment.
We're not going to cover every kind of AWS log collection practice in this blog. Below is one example explaining how we collect AWS Redshift audit logs that were saved to an AWS S3 bucket; the configuration for saving audit logs to S3 is described in this manual.
In our Logstash configuration file (logstash_cm_aws.yaml), we use the S3 input plugin to monitor bucket changes in S3 and extract the log documents into the Elasticsearch cluster through Logstash.
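A sketch of what logstash_cm_aws.yaml could contain is shown below, using the Logstash S3 input plugin. The bucket name, region, prefix, polling interval, and index pattern are all illustrative; bucket access is assumed to come from the node's IAM role rather than embedded keys.

apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config-aws
  namespace: elastic-system
data:
  logstash.conf: |
    input {
      s3 {
        bucket   => "redshift-audit-logs"                    # hypothetical bucket name
        region   => "ap-northeast-1"                         # hypothetical region
        prefix   => "AWSLogs/"                               # path where Redshift writes audit logs
        interval => 60                                       # poll the bucket every 60 seconds
        codec    => "plain"
        add_field => { "log_source" => "redshift-audit" }
      }
    }
    output {
      elasticsearch {
        hosts => ["https://elastic-cluster-es-http:9200"]    # assumed ECK service name
        user => "elastic"
        password => "${ELASTIC_PASSWORD}"
        ssl => true
        ssl_certificate_verification => false                # sketch shortcut
        index => "redshift-audit-%{+YYYY.MM.dd}"
      }
    }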
After deploying the above two yaml files, we can view and check the Redshift logs.
Deploy Filebeat
We use the following Filebeat deployment file (filebeat.yaml) to collect logs from our Kubernetes cluster. In the output.logstash: section, we use the Logstash service external IP (hostname) generated in the previous step.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
        - add_tags:
            tags: [internal-k8s]

    output.logstash:
      hosts: ["internal-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.amazonaws.com:5044"]

    setup.template.name: "filebeat"
    setup.template.pattern: "filebeat-*"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Exists
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.13.0
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs: ["get", "list", "watch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
After running kubectl apply -f filebeat.yaml in our cluster, we can see the Filebeat DaemonSet deployed on all our Kubernetes nodes (there are 2 more nodes in the cluster besides the ELK nodes).
$ kubectl get daemonsets -n kube-system | grep filebeat
NAME       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
filebeat   8         8         8       8            8           kubernetes.io/os=linux   7h
And in the Kibana UI, we can see the Kubernetes system logs are correctly collected.
Conclusion
In this blog, we introduced the architecture of our centralized logging system and the procedure for deploying a fully functional ELK stack on a Kubernetes cluster (AWS EKS in our case). We also covered the detailed steps for setting up Logstash and Filebeat, as well as collecting logs from an AWS service.