
eBPF-Based Network Observability: Exploring Cilium Hubble and Alternatives

Anish Bista

eBPF (Extended Berkeley Packet Filter) is a revolutionary Linux kernel technology that allows safe, efficient, and dynamic execution of custom programs directly within the kernel. By leveraging eBPF, modern observability tools can monitor and analyze system behavior without the overhead or risks associated with traditional instrumentation methods.

Why eBPF?

  • Low Overhead: eBPF runs directly in the kernel, minimizing the performance impact on the system while providing high-fidelity insights.
  • Dynamic Instrumentation: Modify monitoring and tracing logic without rebooting or recompiling the kernel.
  • Deep Visibility: Monitor system calls, network traffic, and process behavior in real time.
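
As a quick illustration of dynamic instrumentation, the following bpftrace one-liner (a sketch, assuming bpftrace is installed) attaches an eBPF program to the openat syscall tracepoint and reports which process opens which file, with no reboot or kernel rebuild:

    # Print "<process> -> <file>" for every openat() call, system-wide; Ctrl-C to stop
    sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s -> %s\n", comm, str(args->filename)); }'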

This technology has become the cornerstone of cloud-native observability and security, offering unparalleled insights into workloads and infrastructure. Cilium Hubble builds upon eBPF to deliver an exceptional observability solution tailored for Kubernetes environments.

What is Cilium?

Cilium is open source software for transparently securing the network connectivity between application services deployed using Linux container management platforms like Docker and Kubernetes.

At the foundation of Cilium is a new Linux kernel technology called eBPF, which enables the dynamic insertion of powerful security visibility and control logic within Linux itself. Because eBPF runs inside the Linux kernel, Cilium security policies can be applied and updated without any changes to the application code or container configuration.

What is Hubble?

Architecture of Cilium Hubble (source: isovalent.com)

Hubble is a fully distributed networking and security observability platform. It is built on top of Cilium and eBPF to enable deep visibility into the communication and behavior of services as well as the networking infrastructure in a completely transparent manner.

By building on top of Cilium, Hubble can leverage eBPF for visibility: all visibility is programmable, enabling a dynamic approach that minimizes overhead while providing deep, detailed insight exactly where users need it. Hubble was created and designed specifically to make the best use of these eBPF capabilities.

Component Overview

Components of Cilium and Hubble (source: cilium.io)

Cilium Components

Agent

The Cilium agent (cilium-agent) operates on every node in the Kubernetes cluster. It manages network configurations, service load-balancing, network policies, and monitoring by processing input from Kubernetes or APIs. This agent monitors orchestration systems like Kubernetes for container and workload events, ensuring seamless networking setup and teardown. Additionally, it manages eBPF programs within the Linux kernel, which control all ingress and egress traffic for the containers.

CLI Client

The Cilium CLI client (cilium) is a command-line tool installed alongside the Cilium agent. It interacts with the local agent's REST API to provide insights into its state and functionality. The CLI also allows direct access to eBPF maps, enabling users to inspect and validate their configurations.
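
For illustration, a few typical invocations (a sketch; in a Kubernetes install the CLI runs inside the agent pod, and recent Cilium releases ship it there as cilium-dbg):

    # Query the local agent's health and datapath status
    kubectl -n kube-system exec ds/cilium -- cilium status
    # List the eBPF maps managed by the agent
    kubectl -n kube-system exec ds/cilium -- cilium map list
    # Dump the load-balancing eBPF map to verify service entries
    kubectl -n kube-system exec ds/cilium -- cilium bpf lb list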

Operator

The Cilium Operator manages cluster-wide tasks, ensuring that certain responsibilities are handled centrally rather than at the node level. Although not involved in the critical paths for forwarding or enforcing network policies, the operator is essential for functions like IP address management (IPAM). Temporary unavailability of the operator may lead to:

  • Delayed allocation of IP addresses, which can postpone workload scheduling.
  • Missed updates to the kvstore heartbeat key, causing agents to perceive the kvstore as unhealthy and restart.

CNI Plugin

The CNI plugin (cilium-cni) is triggered by Kubernetes whenever a pod is scheduled or terminated on a node. It communicates with the node's Cilium API to configure networking, load-balancing, and network policies required for the pod's operation.

Hubble Components

Server

The Hubble server, integrated into the Cilium agent, collects visibility data using eBPF. This tight integration ensures high performance and low overhead. It provides gRPC services for accessing flow events and supports Prometheus monitoring for metrics collection and visualization.

Relay

Hubble Relay (hubble-relay) is a standalone component designed for cluster-wide observability. It connects to the gRPC APIs of all Hubble servers in the cluster, aggregating their data into a unified API for comprehensive visibility.

CLI Client

The Hubble CLI (hubble) connects to either a local Hubble server or the Hubble Relay via gRPC to fetch flow data and events.
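
A few representative commands (a sketch, assuming the Cilium CLI is available locally to set up the port-forward to Hubble Relay):

    # Forward the Hubble Relay API to localhost
    cilium hubble port-forward &
    # Stream all flows cluster-wide as they happen
    hubble observe --follow
    # Filter flows, e.g. dropped traffic in a namespace, or HTTP traffic only
    hubble observe --namespace default --verdict DROPPED
    hubble observe --protocol http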

Graphical UI

The Hubble UI (hubble-ui) leverages the relay's visibility features to present an intuitive graphical interface. It provides service dependency diagrams and connectivity maps to help visualize the cluster's network flow.

Key Features

  1. Real-Time Network Observability

    • Visualize and monitor network flows, service dependencies, and traffic patterns.
    • Detect issues like packet drops, flow latency, or misconfigured network policies immediately.
  2. Kubernetes-Native Insights

    • Gain observability using Kubernetes-native identities like pods, namespaces, and labels.
    • Understand inter-service communication patterns with a focus on microservices architecture.
  3. Application Layer Visibility

    • Observe DNS queries, HTTP traffic, and TCP flows for enhanced debugging.
    • Enforce Layer 7 policies with visibility into application-layer data (see the example policy after this list).
  4. Seamless Integration with Monitoring Tools

    • Export metrics to Prometheus and visualize them in Grafana dashboards.
    • Use Hubble’s UI for an intuitive, graphical representation of network traffic.
  5. Security Policy Monitoring

    • Ensure compliance and security by monitoring policy enforcement.
    • Use identity-based security to secure workloads dynamically.
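
As a sketch of the Layer 7 enforcement mentioned above, the policy below (the app labels are illustrative) allows only HTTP GET requests to / from pods labeled app: pod-a to pods labeled app: pod-b:

    apiVersion: cilium.io/v2
    kind: CiliumNetworkPolicy
    metadata:
      name: allow-get-only
    spec:
      endpointSelector:
        matchLabels:
          app: pod-b
      ingress:
        - fromEndpoints:
            - matchLabels:
                app: pod-a
          toPorts:
            - ports:
                - port: "80"
                  protocol: TCP
              rules:
                http:
                  - method: GET
                    path: "/"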

Challenges and Limitations

While Cilium Hubble offers cutting-edge features, there are certain challenges to consider:

  1. Technical Expertise

    • Setting up and managing Cilium Hubble requires knowledge of Kubernetes, eBPF, and networking concepts.
  2. Kubernetes Dependency

    • Cilium Hubble is designed specifically for Kubernetes environments, limiting its use in non-containerized workloads.

Hands-On Demo: Setting Up Cilium Hubble for Network Observability

Prerequisites

  • Kind CLI: To create a local Kubernetes cluster.
  • Helm: For installing Cilium and Prometheus/Grafana.
  • kubectl: To interact with the cluster.

Step 1: Setting Up the Kind Cluster

  1. Install Kind CLI if not already installed.

  2. Create a file named kind.yaml with the following content:

    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    nodes:
      - role: control-plane
        image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
      - role: worker
        image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
      - role: worker
        image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
  3. Create the cluster:

    kind create cluster --config kind.yaml --name kind-cluster

Your Kind cluster with two worker nodes and one control-plane node is now ready.
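
You can confirm the cluster is up before proceeding (kind prefixes the kubectl context name with kind-):

    # Should list one control-plane node and two worker nodes
    kubectl get nodes --context kind-kind-cluster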

Step 2: Installing Prometheus and Grafana

  • Add the Prometheus Helm repository:

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
  • Install the Prometheus stack:

    helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace
  • Access Grafana:

    kubectl port-forward svc/kube-prometheus-stack-grafana 3000:80 -n monitoring

    Open your browser and navigate to http://localhost:3000.

    • Username: admin
    • Password: prom-operator

Step 3: Installing Cilium Hubble

  • Add and install the Cilium Helm chart:

    helm repo add cilium https://helm.cilium.io/
    helm repo update
    helm install cilium cilium/cilium --version 1.16.5 \
      --namespace kube-system \
      --set prometheus.enabled=true \
      --set operator.prometheus.enabled=true \
      --set hubble.enabled=true \
      --set hubble.relay.enabled=true \
      --set hubble.ui.enabled=true \
      --set hubble.metrics.enableOpenMetrics=true \
      --set hubble.metrics.enabled="{dns,drop,tcp,flow,flows-to-world,port-distribution,icmp,httpV2:exemplars=true;labelsContext=source_ip\,source_namespace\,source_workload\,destination_ip\,destination_namespace\,destination_workload\,traffic_direction}"
  • Customize metrics by modifying the Helm values:

    helm get values cilium --namespace kube-system --output yaml > cilium.yaml

    Add the following under metrics.enabled:

    - flows-to-world:labelsContext=source_namespace,source_app,destination_namespace,destination_app
    - flow:labelsContext=source_namespace,source_app,destination_namespace,destination_app
  • Reapply the changes and restart the Cilium agent:

    helm upgrade cilium cilium/cilium --namespace kube-system -f cilium.yaml && \
      kubectl rollout restart daemonset/cilium -n kube-system && \
      kubectl rollout restart daemonset/cilium-envoy -n kube-system
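
Before moving on, it is worth verifying that the agent and Hubble components are healthy (assuming the Cilium CLI is installed locally):

    # Wait until all Cilium components report ready
    cilium status --wait
    # The cilium DaemonSet should show one running pod per node
    kubectl -n kube-system get pods -l k8s-app=cilium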

Step 4: Configuring Prometheus to Scrape Metrics

  • Export the Prometheus Helm chart values:

    helm show values prometheus-community/kube-prometheus-stack -n monitoring > values.yaml
  • Add the following scrape configurations under additionalScrapeConfigs in values.yaml:

    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
        - role: pod
      relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: ${1}:${2}
          target_label: __address__
    - job_name: 'kubernetes-endpoints'
      scrape_interval: 30s
      kubernetes_sd_configs:
        - role: endpoints
      relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: (.+)(?::\d+);(\d+)
          replacement: $1:$2
  • Apply the changes:

    helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring -f values.yaml
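
To confirm that Prometheus has picked up the new scrape jobs, you can port-forward the Prometheus service created by the chart (the service name below is the chart's default for this release name) and check the Targets page at http://localhost:9090/targets:

    # Expose Prometheus locally, then open http://localhost:9090/targets
    kubectl -n monitoring port-forward svc/kube-prometheus-stack-prometheus 9090:9090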

Testing

To understand the power of Cilium Hubble in action, we will deploy a sample application in our Kubernetes cluster and monitor its network traffic. This practical step will demonstrate how Hubble provides real-time observability into communication patterns, network policies, and traffic flows.

Pod to Pod Communication

Deploy the following applications to test communication from Pod A to Pod B and from Pod B to Pod A:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: service-a
spec:
  selector:
    app: pod-a
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: service-b
spec:
  selector:
    app: pod-b
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-a
  template:
    metadata:
      labels:
        app: pod-a
    spec:
      containers:
        - name: pod-a-container
          image: curlimages/curl
          command: ["sh", "-c", "while true; do curl http://service-b:80; sleep 5; done"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-b
  template:
    metadata:
      labels:
        app: pod-b
    spec:
      containers:
        - name: pod-b-container
          image: nginx
EOF
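
With the demo workloads running, you can watch the periodic curl requests from pod-a to pod-b via the Hubble CLI (assuming the Hubble Relay port-forward described earlier):

    # Live view of flows in the default namespace, including service-b replies
    hubble observe --namespace default --follow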

External World to Pod Communication

To enable communication from the external world to your Kubernetes pods, we utilize a LoadBalancer service with kind's cloud-provider-kind.

Steps:

  • Install the kind cloud provider:

    go install sigs.k8s.io/cloud-provider-kind@latest
    sudo cloud-provider-kind start
  • Apply the example manifest:

    kubectl apply -f https://kind.sigs.k8s.io/examples/loadbalancer/usage.yaml
  • Retrieve the external IP:

    LB_IP=$(kubectl get svc/foo-service -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')
  • Test the communication:

    for _ in {1..10}; do curl ${LB_IP}:5678; done

Pod to External World Communication

To validate that a pod can communicate with the external world, deploy a pod using a lightweight Alpine container that periodically pings a public domain like google.com. Additionally, configure a NetworkPolicy to explicitly allow egress traffic (an example policy follows the deployment manifest below).

Apply the Deployment Manifest

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-connectivity
  labels:
    app: test-connectivity
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-connectivity
  template:
    metadata:
      labels:
        app: test-connectivity
    spec:
      containers:
        - name: ping-container
          image: alpine:latest  # Lightweight image with ping available
          command: ["/bin/sh", "-c"]
          args:
            - while true; do echo "Pinging google.com..."; ping -c 5 google.com; echo "Sleeping for 30 seconds..."; sleep 30; done;
EOF
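
The manifest above only deploys the pinger. A minimal sketch of the egress policy mentioned earlier, written here as a CiliumNetworkPolicy (a plain Kubernetes NetworkPolicy would work as well), allows DNS resolution plus all traffic leaving the cluster:

kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-world
spec:
  endpointSelector:
    matchLabels:
      app: test-connectivity
  egress:
    # Allow DNS lookups via kube-dns so google.com resolves
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
    # Allow any traffic (including ICMP ping) to destinations outside the cluster
    - toEntities:
        - world
EOF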

Validate the Availability of Metrics Using Grafana

  • Open Grafana and navigate to the Explore section.
  • Search for metrics with prefixes like hubble_, cilium_, or envoy_.
  • Validate the metrics relevant to your use case and create dashboards accordingly.

Create Dashboards to Monitor Traffic

  • Pod A to Pod B Communication

    sum by(source_app, destination_app, destination_namespace, source_namespace) (rate(hubble_flows_processed_total{source_app="pod-a", destination_app="pod-b", source_namespace="default", destination_namespace="default"}[$__rate_interval]))
  • Pod B to Pod A Communication

    sum by(source_app, destination_app, destination_namespace, source_namespace) (rate(hubble_flows_processed_total{source_app="pod-b", destination_app="pod-a"}[$__rate_interval]))
  • Pod to External World Communication

    sum by(source_app) (rate(hubble_flows_to_world_total{source_app="test-connectivity", source_namespace="default"}[$__rate_interval]))
  • External World to Pod Communication

    sum by(destination_app) (rate(hubble_flows_processed_total{destination_namespace="default", destination_app="http-echo"}[$__rate_interval]))

Alternatives

1. Calico

  • Pricing Model: Open-source with optional commercial support.
  • Pros:
    • Provides load balancing and in-kernel security enforcement.
    • Widely used in Kubernetes environments.
  • Cons:
    • Primarily focused on networking rather than comprehensive observability.
  • License Model: Open-source licensing with commercial options.

2. Pixie

  • Pricing Model: Open-source with potential commercial features for enterprise users.
  • Pros:
    • Automatically captures telemetry data without manual instrumentation.
    • Provides high-level service maps and detailed application traffic views.
  • Cons:
    • May not offer as much low-level network detail as other tools.
  • License Model: Open-source with optional enterprise features.

3. Kubeshark

  • Pricing Model: Open-source community edition with proprietary enterprise options.
  • Pros: Provides real-time protocol-level visibility into Kubernetes traffic.
  • Cons: Primarily focused on Kubernetes; may not be suitable for non-containerized applications.
  • License Model: Open-source with enterprise licensing for advanced features.

4. OpenTelemetry (Network Monitoring)

  • Pricing Model: Open-source, supported by a vibrant community and cloud providers.
  • Pros:
    • Unified framework for collecting, processing, and exporting metrics, logs, and traces.
    • Provides visibility into network traffic as part of broader observability.
    • Vendor-neutral, integrates seamlessly with Prometheus, Grafana, and Jaeger.
  • Cons:
    • Requires careful setup to capture eBPF-based network telemetry.
    • May not offer as detailed protocol-level traffic analysis compared to dedicated tools like Kubeshark.
  • License Model: Open-source under the CNCF (Cloud Native Computing Foundation).

Conclusion

Due to increasing threats and growing compliance requirements, we should monitor both east-west and north-south traffic in Kubernetes clusters. eBPF-based tools like Cilium Hubble are revolutionizing network observability in Kubernetes, offering deep visibility into traffic, service dependencies, and security. In the cloud-native era, adopting eBPF-based solutions is essential for managing the complexity of Kubernetes while ensuring security and performance.

Enhance Your Kubernetes Observability with eBPF

Discover how eBPF-based solutions can enhance your Kubernetes environment and security—get started with the right tool for your needs.
