eBPF (Extended Berkeley Packet Filter) is a revolutionary Linux kernel technology that allows safe, efficient, and dynamic execution of custom programs directly within the kernel. By leveraging eBPF, modern observability tools can monitor and analyze system behavior without the overhead or risks associated with traditional instrumentation methods.
Why eBPF?
- Low Overhead: eBPF runs directly in the kernel, minimizing the performance impact on the system while providing high-fidelity insights.
- Dynamic Instrumentation: Modify monitoring and tracing logic without rebooting or recompiling the kernel.
- Deep Visibility: Monitor system calls, network traffic, and process behavior in real time.
This technology has become the cornerstone of cloud-native observability and security, offering unparalleled insights into workloads and infrastructure. Cilium Hubble builds upon eBPF to deliver an exceptional observability solution tailored for Kubernetes environments.
What is Cilium?
Cilium is open source software for transparently securing the network connectivity between application services deployed using Linux container management platforms like Docker and Kubernetes.
At the foundation of Cilium is a new Linux kernel technology called eBPF, which enables the dynamic insertion of powerful security visibility and control logic within Linux itself. Because eBPF runs inside the Linux kernel, Cilium security policies can be applied and updated without any changes to the application code or container configuration.
What is Hubble?
Hubble is a fully distributed networking and security observability platform. It is built on top of Cilium and eBPF to enable deep visibility into the communication and behavior of services as well as the networking infrastructure in a completely transparent manner.
By building on top of Cilium, Hubble can leverage eBPF for visibility. By relying on eBPF, all visibility is programmable and allows for a dynamic approach that minimizes overhead while providing deep and detailed visibility as required by users. Hubble has been created and specifically designed to make best use of these new eBPF powers.
Component Overview
Cilium Components
Agent
The Cilium agent (`cilium-agent`) operates on every node in the Kubernetes cluster. It manages network configurations, service load-balancing, network policies, and monitoring by processing input from Kubernetes or APIs. The agent watches orchestration systems like Kubernetes for container and workload events, ensuring seamless networking setup and teardown. Additionally, it manages the eBPF programs in the Linux kernel that control all ingress and egress traffic for the containers.
CLI Client
The Cilium CLI client (`cilium`) is a command-line tool installed alongside the Cilium agent. It interacts with the local agent's REST API to provide insights into its state and functionality. The CLI also allows direct access to eBPF maps, enabling users to inspect and validate their configurations.
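For example, a few typical inspection commands (a quick sketch; it assumes the agent runs as the `cilium` DaemonSet in `kube-system`, and on recent releases the in-pod binary may be named `cilium-dbg` rather than `cilium`):

```bash
# Inspect agent state and eBPF maps from inside a Cilium agent pod
kubectl -n kube-system exec ds/cilium -- cilium status --brief
kubectl -n kube-system exec ds/cilium -- cilium endpoint list
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list
```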
Operator
The Cilium Operator manages cluster-wide tasks, ensuring that certain responsibilities are handled centrally rather than at the node level. Although not involved in the critical paths for forwarding or enforcing network policies, the operator is essential for functions like IP address management (IPAM). Temporary unavailability of the operator may lead to:
- Delayed allocation of IP addresses, which can postpone workload scheduling.
- Missed updates to the kvstore heartbeat key, causing agents to perceive the kvstore as unhealthy and restart.
CNI Plugin
The CNI plugin (`cilium-cni`) is invoked by Kubernetes whenever a pod is scheduled or terminated on a node. It communicates with the node's Cilium API to configure the networking, load-balancing, and network policies required for the pod's operation.
Hubble Components
Server
The Hubble server, integrated into the Cilium agent, collects visibility data using eBPF. This tight integration ensures high performance and low overhead. It provides gRPC services for accessing flow events and supports Prometheus monitoring for metrics collection and visualization.
Relay
Hubble Relay (`hubble-relay`) is a standalone component designed for cluster-wide observability. It connects to the gRPC APIs of all Hubble servers in the cluster, aggregating their data into a unified API for comprehensive visibility.
CLI Client
The Hubble CLI (`hubble`) connects to either a local Hubble server or the Hubble Relay via gRPC to fetch flow data and events.
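As an illustration, a minimal way to point the CLI at the relay (assuming the default `hubble-relay` service name and port in `kube-system`):

```bash
# Forward the Hubble Relay API to localhost, then check status and stream flows
kubectl -n kube-system port-forward svc/hubble-relay 4245:80 &
hubble status
hubble observe --namespace default --follow
```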
Graphical UI
The Hubble UI (`hubble-ui`) leverages the relay's visibility features to present an intuitive graphical interface. It provides service dependency diagrams and connectivity maps to help visualize the cluster's network flows.
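To open the UI locally, one option is the following (the service name and port assume the Helm chart defaults):

```bash
# Either let the cilium CLI set up the port-forward...
cilium hubble ui
# ...or forward the hubble-ui service manually and browse to http://localhost:12000
kubectl -n kube-system port-forward svc/hubble-ui 12000:80
```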
Key Features
- Real-Time Network Observability
  - Visualize and monitor network flows, service dependencies, and traffic patterns.
  - Detect issues like packet drops, flow latency, or misconfigured network policies immediately.
- Kubernetes-Native Insights
  - Gain observability using Kubernetes-native identities like pods, namespaces, and labels.
  - Understand inter-service communication patterns with a focus on microservices architecture.
- Application Layer Visibility
  - Observe DNS queries, HTTP traffic, and TCP flows for enhanced debugging.
  - Enforce Layer 7 policies with visibility into application-layer data.
- Seamless Integration with Monitoring Tools
  - Export metrics to Prometheus and visualize them in Grafana dashboards.
  - Use Hubble’s UI for an intuitive, graphical representation of network traffic.
- Security Policy Monitoring
  - Ensure compliance and security by monitoring policy enforcement.
  - Use identity-based security to secure workloads dynamically.
Challenges and Limitations
While Cilium Hubble offers cutting-edge features, there are certain challenges to consider:
- Technical Expertise
  - Setting up and managing Cilium Hubble requires knowledge of Kubernetes, eBPF, and networking concepts.
- Kubernetes Dependency
  - Cilium Hubble is designed specifically for Kubernetes environments, limiting its use in non-containerized workloads.
Hands-On Demo: Setting Up Cilium Hubble for Network Observability
Prerequisites
- Kind CLI: To create a local Kubernetes cluster.
- Helm: For installing Cilium and Prometheus/Grafana.
- kubectl: To interact with the cluster.
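If you still need these tools, one convenient way to install them (macOS or Linux with Homebrew; adjust for your platform):

```bash
brew install kind helm kubectl
```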
Step 1: Setting Up the Kind Cluster
- Install the Kind CLI if not already installed.
- Create a file named `kind.yaml` with the following content:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
  - role: worker
    image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
  - role: worker
    image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
```

- Create the cluster:

```bash
kind create cluster --config kind.yaml --name kind-cluster
```
Your Kind cluster with two worker nodes and one control-plane node is now ready.
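Optionally, confirm that the control-plane and both worker nodes registered:

```bash
kubectl get nodes -o wide
```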
Step 2: Installing Prometheus and Grafana
- Add the Prometheus Helm repository:

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

- Install the Prometheus stack:

```bash
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace
```

- Access Grafana:

```bash
kubectl port-forward svc/kube-prometheus-stack-grafana -n monitoring 3000:80
```

Open your browser and navigate to `http://localhost:3000`.
  - Username: `admin`
  - Password: `prom-operator`
Step 3: Installing Cilium Hubble
- Add the Cilium Helm repository and install the chart:

```bash
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.16.5 \
  --namespace kube-system \
  --set prometheus.enabled=true \
  --set operator.prometheus.enabled=true \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.metrics.enableOpenMetrics=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,flow-to-world,port-distribution,icmp,httpV2:exemplars=true;labelsContext=source_ip\,source_namespace\,source_workload\,destination_ip\,destination_namespace\,destination_workload\,traffic_direction}"
```

- Customize metrics by modifying the Helm values:

```bash
helm get values cilium --namespace kube-system --output yaml > cilium.yaml
```

Add the following under `hubble.metrics.enabled`:

```yaml
- flows-to-world:labelsContext=source_namespace,source_app,destination_namespace,destination_app
- flow:labelsContext=source_namespace,source_app,destination_namespace,destination_app
```

- Reapply the changes and restart the Cilium agent:

```bash
helm upgrade cilium cilium/cilium --namespace kube-system -f cilium.yaml && \
kubectl rollout restart daemonset/cilium -n kube-system && \
kubectl rollout restart daemonset/cilium-envoy -n kube-system
```
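Before moving on, it is worth confirming that the agents and Hubble components came back healthy (the label selectors assume the chart defaults):

```bash
kubectl -n kube-system rollout status daemonset/cilium
kubectl -n kube-system get pods -l k8s-app=hubble-relay
kubectl -n kube-system get pods -l k8s-app=hubble-ui
```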
Step 4: Configuring Prometheus to Scrape Metrics
- Export the Prometheus Helm chart values:

```bash
helm show values prometheus-community/kube-prometheus-stack -n monitoring > values.yaml
```

- Add the following scrape configurations under `additionalScrapeConfigs` in `values.yaml`:

```yaml
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: ${1}:${2}
      target_label: __address__
- job_name: 'kubernetes-endpoints'
  scrape_interval: 30s
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: (.+)(?::\d+);(\d+)
      replacement: $1:$2
```

- Apply the changes:

```bash
helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring -f values.yaml
```
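To verify that Prometheus picked up the new scrape configuration, you can port-forward the Prometheus service (the service name assumes the release name used above) and check the targets page:

```bash
kubectl -n monitoring port-forward svc/kube-prometheus-stack-prometheus 9090:9090
# then open http://localhost:9090/targets in a browser
```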
Testing
To understand the power of Cilium Hubble in action, we will deploy a sample application in our Kubernetes cluster and monitor its network traffic. This practical step will demonstrate how Hubble provides real-time observability into communication patterns, network policies, and traffic flows.
Pod to Pod Communication
Deploy the following applications to test Pod A to Pod B and Pod B to Pod A communication:
```yaml
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
name: service-a
spec:
selector:
app: pod-a
ports:
- protocol: TCP
port: 80
targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: service-b
spec:
selector:
app: pod-b
ports:
- protocol: TCP
port: 80
targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: deployment-a
spec:
replicas: 1
selector:
matchLabels:
app: pod-a
template:
metadata:
labels:
app: pod-a
spec:
containers:
- name: pod-a-container
image: curlimages/curl
command: ["sh", "-c", "while true; do curl http://service-b:80;sleep 5; done"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: deployment-b
spec:
replicas: 1
selector:
matchLabels:
app: pod-b
template:
metadata:
labels:
app: pod-b
spec:
containers:
- name: pod-b-container
image: nginx
EOF
```
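With the workloads running, you can watch the resulting flows live with the Hubble CLI (a sketch that assumes the Hubble Relay port-forward shown earlier and a recent Hubble CLI):

```bash
# Stream flows involving the pod-a workload in the default namespace
hubble observe --namespace default --label app=pod-a --follow
```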
External World to Pod Communication
To enable communication from the external world to your Kubernetes pods, we use a LoadBalancer service together with kind's `cloud-provider-kind`.
Steps:
- Install the kind cloud provider:

```bash
go install sigs.k8s.io/cloud-provider-kind@latest
sudo cloud-provider-kind start
```

- Apply the example manifest:

```bash
kubectl apply -f https://kind.sigs.k8s.io/examples/loadbalancer/usage.yaml
```

- Retrieve the external IP:

```bash
LB_IP=$(kubectl get svc/foo-service -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')
```

- Test the communication:

```bash
for _ in {1..10}; do curl ${LB_IP}:5678; done
```
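The pods created by the example manifest carry the `app: http-echo` label (the same label the dashboard query later in this article assumes), so you can also confirm the incoming flows with Hubble (flag names assume a recent Hubble CLI):

```bash
hubble observe --namespace default --to-label app=http-echo --follow
```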
Pod to External World Communication
To validate that a pod can communicate with the external world, deploy a pod that uses a lightweight Alpine container to periodically ping a public domain such as google.com. Additionally, you can configure a NetworkPolicy to explicitly allow egress traffic (a sample policy is sketched after the deployment manifest below).
Apply the Deployment Manifest
```yaml
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-connectivity
labels:
app: test-connectivity
spec:
replicas: 1
selector:
matchLabels:
app: test-connectivity
template:
metadata:
labels:
app: test-connectivity
spec:
containers:
- name: ping-container
image: alpine:latest # Lightweight image with ping available
command: ["/bin/sh", "-c"]
args:
- while true; do
echo "Pinging google.com...";
ping -c 5 google.com;
echo "Sleeping for 30 seconds...";
sleep 30;
done;
EOF
```
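For the NetworkPolicy mentioned above, here is a minimal sketch; the policy name and the allow-all egress rule are illustrative assumptions, not part of the original setup:

```yaml
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-test-connectivity   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: test-connectivity
  policyTypes:
    - Egress
  egress:
    - {}   # allow all egress, including DNS and ICMP to the outside world
EOF
```

You can then check that the pings still succeed with `kubectl logs deploy/test-connectivity`.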
Validate the Availability of Metrics Using Grafana
- Open Grafana and navigate to the Explore section.
- Search for metrics with prefixes like `hubble_`, `cilium_`, or `envoy_`.
- Validate the metrics relevant to your use case and create dashboards accordingly.
Create Dashboards to Monitor Traffic
- Pod A to Pod B Communication

```promql
sum by(source_app, destination_app, destination_namespace, source_namespace) (rate(hubble_flows_processed_total{source_app="pod-a", destination_app="pod-b", source_namespace="default", destination_namespace="default"}[$__rate_interval]))
```

- Pod B to Pod A Communication

```promql
sum by(source_app, destination_app, destination_namespace, source_namespace) (rate(hubble_flows_processed_total{source_app="pod-b", destination_app="pod-a"}[$__rate_interval]))
```

- Pod to External World Communication

```promql
sum by(source_app) (rate(hubble_flows_to_world_total{source_app="test-connectivity", source_namespace="default"}[$__rate_interval]))
```

- External World to Pod Communication

```promql
sum by(destination_app) (rate(hubble_flows_processed_total{destination_namespace="default", destination_app="http-echo"}[$__rate_interval]))
```
Alternatives
1. Calico
- Pricing Model: Open-source with optional commercial support.
- Pros:
- Provides load balancing and in-kernel security enforcement.
- Widely used in Kubernetes environments.
- Cons:
- Primarily focused on networking rather than comprehensive observability.
- License Model: Open-source licensing with commercial options.
2. Pixie
- Pricing Model: Open-source with potential commercial features for enterprise users.
- Pros:
- Automatically captures telemetry data without manual instrumentation.
- Provides high-level service maps and detailed application traffic views.
- Cons:
- May not offer as much low-level network detail as other tools.
- License Model: Open-source with optional enterprise features.
3. Kubeshark
- Pricing Model: Open-source community edition with proprietary enterprise options.
- Pros: Provides real-time protocol-level visibility into Kubernetes traffic.
- Cons: Primarily focused on Kubernetes; may not be suitable for non-containerized applications.
- License Model: Open-source with enterprise licensing for advanced features.
4. OpenTelemetry Network (OpenTelemetry Network Monitoring)
- Pricing Model: Open-source, supported by a vibrant community and cloud providers.
- Pros:
- Unified framework for collecting, processing, and exporting metrics, logs, and traces.
- Provides visibility into network traffic as part of broader observability.
- Vendor-neutral, integrates seamlessly with Prometheus, Grafana, and Jaeger.
- Cons:
- Requires careful setup to capture eBPF-based network telemetry.
- May not offer as detailed protocol-level traffic analysis compared to dedicated tools like Kubeshark.
- License Model: Open-source; the project is hosted by the CNCF (Cloud Native Computing Foundation).
Conclusion
Due to increasing threats and growing compliance requirements, we should monitor both east-west and north-south traffic in our Kubernetes clusters. eBPF-based tools like Cilium Hubble are revolutionizing network observability in Kubernetes, offering deep visibility into traffic, service dependencies, and security. In the cloud-native era, adopting eBPF-based solutions is essential for managing the complexity of Kubernetes while ensuring security and performance.
Enhance Your Kubernetes Observability with eBPF
Discover how eBPF-based solutions can enhance your Kubernetes environment and security—get started with the right tool for your needs.