Deliver Kubernetes application logs to ELK with filebeat


One of the problems you may face while running applications in a Kubernetes cluster is gaining visibility into what is going on. Running kubectl logs is fine while you only have a few nodes, but as the cluster grows you need to be able to view and query your logs from a centralized location.

If you have an Elastic Stack in place you can run a logging agent – filebeat, for instance – as a DaemonSet and securely deliver your application logs from the Kubernetes cluster to Logstash. Logs that applications write to stdout/stderr are picked up by the Docker engine and saved under /var/lib/docker/containers in JSON format. Filebeat can be configured to read these files and deliver the messages to your stack.
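Each line in those files is a small JSON document produced by Docker's json-file logging driver; the log, stream, and time fields are what filebeat ships, which is why the Logstash filter further down renames log to message and parses time. The request line itself is just an example:

{"log":"GET /healthz HTTP/1.1 200\n","stream":"stdout","time":"2017-07-12T08:23:45.123456789Z"}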

A DaemonSet ensures that all (or some) nodes run a copy of a pod. As nodes are added to the cluster, pods are added to them, and as nodes are removed from the cluster, those pods are garbage collected.

In this example, filebeat will run as a DaemonSet with one pod on every node in the cluster, delivering the application logs to Logstash.

Let's assume Logstash is expecting connections from filebeat on port 5514:

# https://gist.github.com/arslanm/b0dbae19e3371a8e92d774f330babdc8
input {
  # ...
  # filebeat access at port 5514
  beats {
    port => 5514
    type => beats
    ssl => true
    ssl_certificate_authorities => [ "/etc/logstash/ssl/cacert.crt" ]
    ssl_certificate => "/etc/logstash/ssl/server.crt"
    ssl_key => "/etc/logstash/ssl/server.key"
    ssl_key_passphrase => "password"
    ssl_verify_mode => force_peer
  }
  # ...
}

Next we need to add a filter to Logstash to parse the messages delivered by filebeat:

# https://gist.github.com/arslanm/f9662c7db400587fa362ccb15d0b18a0
filter {
  # ...
  # logs delivered by filebeat
  if [program] == "filebeat_k8s" {
    mutate {
      rename => { "log" => "message" }
    }
    date {
      match => ["time", "ISO8601"]
      remove_field => ["time"]
    }
    grok {
      match => { "source" => "/var/log/containers/%{DATA:k8s_pod}_%{DATA:k8s_namespace}_%{GREEDYDATA:k8s_service}-%{DATA:k8s_container_id}.log" }
      remove_field => ["source"]
    }
  }
  # ...
}
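To make the grok pattern concrete, a file in the usual pod_namespace_container-containerid.log form breaks down as follows. The names below are made up and the container ID is shortened; real ones are 64 hex characters:

# source:           /var/log/containers/myapp-2662880507-9vrqz_production_myapp-3b7a4e18c09a.log
# k8s_pod           myapp-2662880507-9vrqz
# k8s_namespace     production
# k8s_service       myapp
# k8s_container_id  3b7a4e18c09a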

Now that Logstash is ready for filebeat, let's create a Secret object to store the SSL CA certificate, the client certificate, and the private key that filebeat will use to secure its connection to Logstash.

# https://gist.github.com/arslanm/8edbf82da0d7de0f41ebe80f109076c6
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Secret
metadata:
  name: filebeatssl
  namespace: kube-system
type: Opaque
data:
  cacert: $(cat cacert.crt | base64 | tr -d '\n')
  cert: $(cat client.crt | base64 | tr -d '\n')
  key: $(cat client.key | base64 | tr -d '\n')
EOF

The base64-encoded contents of cacert.crt, client.crt, and client.key will be stored under the keys cacert, cert, and key. Once the filebeat pods running on each node have mounted this Secret object as a directory, cacert, cert, and key will be accessible as regular files inside it.
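To confirm the Secret was created with the expected keys, you can inspect it; this lists only the key names and their sizes, not the contents:

kubectl describe secret filebeatssl --namespace=kube-system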

The next step is creating a Docker image for filebeat. Although there are a number of public images that could be used for the filebeat DaemonSet, you may want to control the filebeat configuration that is built into the image.

Save the configuration below as filebeat.yml.

# https://gist.github.com/arslanm/4a71e1b12c1f5dcc3f7ac736fe483f41
filebeat.registry_file: /var/log/filebeat.registry

logging.level: ${LOG_LEVEL:error}

# optional: include cloud metadata in the events delivered to your Elastic Stack
processors:
- add_cloud_metadata:

fields_under_root: true
fields:
  program: filebeat_k8s
  hostname: ${FILEBEAT_HOST:${HOSTNAME}}

filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/containers/*.log
    symlinks: true
    tail_files: true
    json.message_key: log
    json.keys_under_root: true
    json.add_error_key: true
    multiline.pattern: '^[[:space:]]'
    multiline.match: after

output.logstash:
  hosts: [ "${LOGSTASH_HOST}:${LOGSTASH_PORT}" ]
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/ssl/cacert"]
  ssl.certificate: "/etc/ssl/cert"
  ssl.key: "/etc/ssl/key"
  ssl.key_passphrase: "password"
  ssl.verification_mode: full
  ssl.supported_protocols: [TLSv1.1, TLSv1.2]

The ssl options above point to the /etc/ssl directory, where the certificate secrets we stored earlier will be mounted when we create the filebeat DaemonSet.

Filebeat will process log files ending with .log in the /var/log/containers directory. This directory will be made available to the filebeat pods when we create the DaemonSet.
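The multiline settings deserve a word: with negate left at its default of false and match set to after, any line that starts with whitespace is appended to the line before it, which is the typical shape of a stack trace. The three application log lines below (an invented Java example) would therefore arrive in Logstash as a single message:

Exception in thread "main" java.lang.NullPointerException
        at com.example.Handler.process(Handler.java:42)
        at com.example.Main.main(Main.java:7)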

Next, the Docker image. Save the following as Dockerfile:

# https://gist.github.com/arslanm/40ac24d2d1a48797600e95f4497fd236
FROM centos:6

RUN curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.5.0-x86_64.rpm && \
    rpm -i filebeat-5.5.0-x86_64.rpm && \
    rm -f filebeat-5.5.0-x86_64.rpm && \
    mkdir -p /etc/filebeat

COPY filebeat.yml /etc/filebeat/filebeat.yml

RUN chmod 600 /etc/filebeat/filebeat.yml

ENTRYPOINT ["/usr/bin/filebeat.sh", "-e", "-v"]
CMD ["-c", "/etc/filebeat/filebeat.yml"]

To build the image and push it to your registry, save the following as Makefile and run make.

# https://gist.github.com/arslanm/50bf98624961191e44ff38fcdc148834
all: build push

REGISTRY=your.registry.address
IMAGE=filebeat
TAG=5.5.0

build:
	docker build -t $(IMAGE):$(TAG) -t $(REGISTRY)/$(IMAGE):$(TAG) -t $(REGISTRY)/$(IMAGE):latest .

push:
	docker push $(REGISTRY)/$(IMAGE):$(TAG)
	docker push $(REGISTRY)/$(IMAGE):latest
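The variables can also be overridden on the command line instead of editing the Makefile; the registry address below is just a placeholder:

make REGISTRY=registry.example.com TAG=5.5.0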

Now on to the DaemonSet. Two hostPath volumes make the /var/log and /var/lib/docker/containers directories on each node available to the filebeat pods. Why two directories? The files in /var/log/containers are symlinks into /var/log/pods, which in turn are symlinks into /var/lib/docker/containers, where the JSON-formatted log messages actually live. The filebeat pods must have access to both directories in order to follow the symlinks and read the log files.
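You can see the chain on any node with readlink; the pod name, pod UID, and container ID below are made up and truncated, but the shape is what matters:

$ readlink /var/log/containers/myapp-2662880507-9vrqz_production_myapp-3b7a4e18c09a....log
/var/log/pods/8f2a6c1e-.../myapp_0.log
$ readlink /var/log/pods/8f2a6c1e-.../myapp_0.log
/var/lib/docker/containers/3b7a4e18c09a.../3b7a4e18c09a...-json.log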

Save the file below as filebeat-daemonset.yml:

# https://gist.github.com/arslanm/7607034481d41a337f608379dfada96f
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat
    k8s-app: filebeat
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "filebeat"
spec:
  selector:
    matchLabels:
      app: filebeat
      k8s-app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
        k8s-app: filebeat
        kubernetes.io/cluster-service: "true"
        kubernetes.io/name: "filebeat"
    spec:
      containers:
      - name: filebeat
        image: your.registry.address/filebeat:latest
        imagePullPolicy: Always
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        env:
        - name: LOGSTASH_HOST
          value: "your.logstash.address"
        - name: LOGSTASH_PORT
          value: "5514"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: filebeatssl
          mountPath: /etc/ssl
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: filebeatssl
        secret:
          secretName: filebeatssl

The SSL secrets we stored earlier will be made available under /etc/ssl, per the filebeatssl volume and volumeMount definitions above.

Run kubectl create -f filebeat-daemonset.yml to create the DaemonSet. You can verify that the pods are in the Running state, one pod per node:

$ kubectl get pods --namespace=kube-system --selector=app=filebeat
NAME             READY     STATUS    RESTARTS   AGE
filebeat-17g9g   1/1       Running   0          43s
filebeat-4ckh7   1/1       Running   0          43s
filebeat-7d0jr   1/1       Running   0          43s
filebeat-8vll4   1/1       Running   0          43s
filebeat-cxcfm   1/1       Running   0          43s
filebeat-gvgh2   1/1       Running   0          43s
filebeat-lc5sm   1/1       Running   0          43s
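If the pods are running but nothing shows up in Kibana, the filebeat output itself (written to stderr because of the -e flag in the ENTRYPOINT, which kubectl logs will show) and the mounted secrets are the first things to check; the pod name below is taken from the listing above:

kubectl logs --namespace=kube-system filebeat-17g9g
kubectl exec --namespace=kube-system filebeat-17g9g -- ls -l /etc/ssl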