eBPF (Extended Berkeley Packet Filter) is a revolutionary Linux kernel technology that allows safe, efficient, and dynamic execution of custom programs directly within the kernel. By leveraging eBPF, modern observability tools can monitor and analyze system behavior without the overhead or risks associated with traditional instrumentation methods.
Why eBPF?
- Low Overhead: eBPF runs directly in the kernel, minimizing the performance impact on the system while providing high-fidelity insights.
- Dynamic Instrumentation: Modify monitoring and tracing logic without rebooting or recompiling the kernel.
- Deep Visibility: Monitor system calls, network traffic, and process behavior in real time.
This technology has become the cornerstone of cloud-native observability and security, offering unparalleled insights into workloads and infrastructure. Cilium Hubble builds upon eBPF to deliver an exceptional observability solution tailored for Kubernetes environments.
What is Cilium?
Cilium is open source software for transparently securing the network connectivity between application services deployed using Linux container management platforms like Docker and Kubernetes.
At the foundation of Cilium is a new Linux kernel technology called eBPF, which enables the dynamic insertion of powerful security visibility and control logic within Linux itself. Because eBPF runs inside the Linux kernel, Cilium security policies can be applied and updated without any changes to the application code or container configuration.
What is Hubble?
Hubble is a fully distributed networking and security observability platform. It is built on top of Cilium and eBPF to enable deep visibility into the communication and behavior of services as well as the networking infrastructure in a completely transparent manner.
By building on top of Cilium, Hubble can leverage eBPF for visibility. By relying on eBPF, all visibility is programmable and allows for a dynamic approach that minimizes overhead while providing deep and detailed visibility as required by users. Hubble has been created and specifically designed to make best use of these new eBPF powers.
Component Overview
Cilium Components
Agent
The Cilium agent (`cilium-agent`) operates on every node in the Kubernetes cluster. It manages network configurations, service load-balancing, network policies, and monitoring by processing input from Kubernetes or APIs. The agent watches orchestration systems like Kubernetes for container and workload events, ensuring seamless networking setup and teardown. Additionally, it manages the eBPF programs in the Linux kernel that control all ingress and egress traffic for the containers.
CLI Client
The Cilium CLI client (`cilium`) is a command-line tool installed alongside the Cilium agent. It interacts with the local agent's REST API to provide insights into its state and functionality. The CLI also allows direct access to eBPF maps, enabling users to inspect and validate their configurations.
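For example, a few typical inspection commands (a quick sketch; it assumes the agent runs as the `cilium` DaemonSet in `kube-system`, and on recent releases the in-pod binary may be named `cilium-dbg` rather than `cilium`):

```bash
# Inspect agent state and eBPF maps from inside a Cilium agent pod
kubectl -n kube-system exec ds/cilium -- cilium status --brief
kubectl -n kube-system exec ds/cilium -- cilium endpoint list
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list
```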
Operator
The Cilium Operator manages cluster-wide tasks, ensuring that certain responsibilities are handled centrally rather than at the node level. Although not involved in the critical paths for forwarding or enforcing network policies, the operator is essential for functions like IP address management (IPAM). Temporary unavailability of the operator may lead to:
- Delayed allocation of IP addresses, which can postpone workload scheduling.
- Missed updates to the kvstore heartbeat key, causing agents to perceive the kvstore as unhealthy and restart.
CNI Plugin
The CNI plugin (`cilium-cni`) is invoked by Kubernetes whenever a pod is scheduled or terminated on a node. It communicates with the node's Cilium API to configure the networking, load-balancing, and network policies required for the pod's operation.
Hubble Components
Server
The Hubble server, integrated into the Cilium agent, collects visibility data using eBPF. This tight integration ensures high performance and low overhead. It provides gRPC services for accessing flow events and supports Prometheus monitoring for metrics collection and visualization.
Relay
Hubble Relay (`hubble-relay`) is a standalone component designed for cluster-wide observability. It connects to the gRPC APIs of all Hubble servers in the cluster, aggregating their data into a unified API for comprehensive visibility.
CLI Client
The Hubble CLI (`hubble`) connects to either a local Hubble server or the Hubble Relay via gRPC to fetch flow data and events.
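As an illustration, a minimal way to point the CLI at the relay (assuming the default `hubble-relay` service name and port in `kube-system`):

```bash
# Forward the Hubble Relay API to localhost, then check status and stream flows
kubectl -n kube-system port-forward svc/hubble-relay 4245:80 &
hubble status
hubble observe --namespace default --follow
```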
Graphical UI
The Hubble UI (`hubble-ui`) leverages the relay's visibility features to present an intuitive graphical interface. It provides service dependency diagrams and connectivity maps to help visualize the cluster's network flows.
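To open the UI locally, one option is the following (the service name and port assume the Helm chart defaults):

```bash
# Either let the cilium CLI set up the port-forward...
cilium hubble ui
# ...or forward the hubble-ui service manually and browse to http://localhost:12000
kubectl -n kube-system port-forward svc/hubble-ui 12000:80
```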
Key Features
- Real-Time Network Observability
  - Visualize and monitor network flows, service dependencies, and traffic patterns.
  - Detect issues like packet drops, flow latency, or misconfigured network policies immediately.
- Kubernetes-Native Insights
  - Gain observability using Kubernetes-native identities like pods, namespaces, and labels.
  - Understand inter-service communication patterns with a focus on microservices architecture.
- Application Layer Visibility
  - Observe DNS queries, HTTP traffic, and TCP flows for enhanced debugging.
  - Enforce Layer 7 policies with visibility into application-layer data.
- Seamless Integration with Monitoring Tools
  - Export metrics to Prometheus and visualize them in Grafana dashboards.
  - Use Hubble’s UI for an intuitive, graphical representation of network traffic.
- Security Policy Monitoring
  - Ensure compliance and security by monitoring policy enforcement.
  - Use identity-based security to secure workloads dynamically.
Challenges and Limitations
While Cilium Hubble offers cutting-edge features, there are certain challenges to consider:
- Technical Expertise
  - Setting up and managing Cilium Hubble requires knowledge of Kubernetes, eBPF, and networking concepts.
- Kubernetes Dependency
  - Cilium Hubble is designed specifically for Kubernetes environments, limiting its use in non-containerized workloads.
Hands-On Demo: Setting Up Cilium Hubble for Network Observability
Prerequisites
- Kind CLI: To create a local Kubernetes cluster.
- Helm: For installing Cilium and Prometheus/Grafana.
- kubectl: To interact with the cluster.
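If you still need these tools, one convenient way to install them (macOS or Linux with Homebrew; adjust for your platform):

```bash
brew install kind helm kubectl
```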
Step 1: Setting Up the Kind Cluster
- Install the Kind CLI if not already installed.
- Create a file named `kind.yaml` with the following content:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
  - role: worker
    image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
  - role: worker
    image: kindest/node:v1.32.0@sha256:c48c62eac5da28cdadcf560d1d8616cfa6783b58f0d94cf63ad1bf49600cb027
```

- Create the cluster:

```bash
kind create cluster --config kind.yaml --name kind-cluster
```
Your Kind cluster with two worker nodes and one control-plane node is now ready.
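Optionally, confirm that the control-plane and both worker nodes registered:

```bash
kubectl get nodes -o wide
```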
Step 2: Installing Prometheus and Grafana
- Add the Prometheus Helm repository:

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

- Install the Prometheus stack:

```bash
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace
```

- Access Grafana:

```bash
kubectl port-forward svc/kube-prometheus-stack-grafana -n monitoring 3000:80
```

Open your browser and navigate to `http://localhost:3000`.
  - Username: `admin`
  - Password: `prom-operator`
Step 3: Installing Cilium Hubble
- Add the Cilium Helm repository and install the chart:

```bash
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.16.5 \
  --namespace kube-system \
  --set prometheus.enabled=true \
  --set operator.prometheus.enabled=true \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.metrics.enableOpenMetrics=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,flow-to-world,port-distribution,icmp,httpV2:exemplars=true;labelsContext=source_ip\,source_namespace\,source_workload\,destination_ip\,destination_namespace\,destination_workload\,traffic_direction}"
```

- Customize metrics by modifying the Helm values:

```bash
helm get values cilium --namespace kube-system --output yaml > cilium.yaml
```

Add the following under `hubble.metrics.enabled`:

```yaml
- flows-to-world:labelsContext=source_namespace,source_app,destination_namespace,destination_app
- flow:labelsContext=source_namespace,source_app,destination_namespace,destination_app
```

- Reapply the changes and restart the Cilium agent:

```bash
helm upgrade cilium cilium/cilium --namespace kube-system -f cilium.yaml && \
kubectl rollout restart daemonset/cilium -n kube-system && \
kubectl rollout restart daemonset/cilium-envoy -n kube-system
```
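Before moving on, it is worth confirming that the agents and Hubble components came back healthy (the label selectors assume the chart defaults):

```bash
kubectl -n kube-system rollout status daemonset/cilium
kubectl -n kube-system get pods -l k8s-app=hubble-relay
kubectl -n kube-system get pods -l k8s-app=hubble-ui
```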
Step 4: Configuring Prometheus to Scrape Metrics
- Export the Prometheus Helm chart values:

```bash
helm show values prometheus-community/kube-prometheus-stack -n monitoring > values.yaml
```

- Add the following scrape configurations under `additionalScrapeConfigs` in `values.yaml`:

```yaml
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: ${1}:${2}
      target_label: __address__
- job_name: 'kubernetes-endpoints'
  scrape_interval: 30s
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: (.+)(?::\d+);(\d+)
      replacement: $1:$2
```

- Apply the changes:

```bash
helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring -f values.yaml
```
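To verify that Prometheus picked up the new scrape configuration, you can port-forward the Prometheus service (the service name assumes the release name used above) and check the targets page:

```bash
kubectl -n monitoring port-forward svc/kube-prometheus-stack-prometheus 9090:9090
# then open http://localhost:9090/targets in a browser
```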
Testing
To understand the power of Cilium Hubble in action, we will deploy a sample application in our Kubernetes cluster and monitor its network traffic. This practical step will demonstrate how Hubble provides real-time observability into communication patterns, network policies, and traffic flows.
Pod to Pod Communication
Deploy the following applications to test Pod A to Pod B and Pod B to Pod A communication:
```yaml
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
name: service-a
spec:
selector:
app: pod-a
ports:
- protocol: TCP
port: 80
targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: service-b
spec:
selector:
app: pod-b
ports:
- protocol: TCP
port: 80
targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: deployment-a
spec:
replicas: 1
selector:
matchLabels:
app: pod-a
template:
metadata:
labels:
app: pod-a
spec:
containers:
- name: pod-a-container
image: curlimages/curl
command: ["sh", "-c", "while true; do curl http://service-b:80;sleep 5; done"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: deployment-b
spec:
replicas: 1
selector:
matchLabels:
app: pod-b
template:
metadata:
labels:
app: pod-b
spec:
containers:
- name: pod-b-container
image: nginx
EOF
```
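With the workloads running, you can watch the resulting flows live with the Hubble CLI (a sketch that assumes the Hubble Relay port-forward shown earlier and a recent Hubble CLI):

```bash
# Stream flows involving the pod-a workload in the default namespace
hubble observe --namespace default --label app=pod-a --follow
```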
External World to Pod Communication
To enable communication from the external world to your Kubernetes pods, we use a LoadBalancer service together with kind's `cloud-provider-kind`.
Steps:
- Install the kind cloud provider:

```bash
go install sigs.k8s.io/cloud-provider-kind@latest
sudo cloud-provider-kind start
```

- Apply the example manifest:

```bash
kubectl apply -f https://kind.sigs.k8s.io/examples/loadbalancer/usage.yaml
```

- Retrieve the external IP:

```bash
LB_IP=$(kubectl get svc/foo-service -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')
```

- Test the communication:

```bash
for _ in {1..10}; do curl ${LB_IP}:5678; done
```
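The pods created by the example manifest carry the `app: http-echo` label (the same label the dashboard query later in this article assumes), so you can also confirm the incoming flows with Hubble (flag names assume a recent Hubble CLI):

```bash
hubble observe --namespace default --to-label app=http-echo --follow
```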
Pod to External World Communication
To validate that a pod can communicate with the external world, deploy a pod that uses a lightweight Alpine container to periodically ping a public domain such as google.com. Additionally, you can configure a NetworkPolicy to explicitly allow egress traffic (a sample policy is sketched after the deployment manifest below).
Apply the Deployment Manifest
```yaml
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-connectivity
labels:
app: test-connectivity
spec:
replicas: 1
selector:
matchLabels:
app: test-connectivity
template:
metadata:
labels:
app: test-connectivity
spec:
containers:
- name: ping-container
image: alpine:latest # Lightweight image with ping available
command: ["/bin/sh", "-c"]
args:
- while true; do
echo "Pinging google.com...";
ping -c 5 google.com;
echo "Sleeping for 30 seconds...";
sleep 30;
done;
EOF
```
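For the NetworkPolicy mentioned above, here is a minimal sketch; the policy name and the allow-all egress rule are illustrative assumptions, not part of the original setup:

```yaml
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-test-connectivity   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: test-connectivity
  policyTypes:
    - Egress
  egress:
    - {}   # allow all egress, including DNS and ICMP to the outside world
EOF
```

You can then check that the pings still succeed with `kubectl logs deploy/test-connectivity`.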
Validate the Availability of Metrics Using Grafana
- Open Grafana and navigate to the Explore section.
- Search for metrics with prefixes like `hubble_`, `cilium_`, or `envoy_`.
- Validate the metrics relevant to your use case and create dashboards accordingly.
Create Dashboards to Monitor Traffic
- Pod A to Pod B Communication

```promql
sum by(source_app, destination_app, destination_namespace, source_namespace) (rate(hubble_flows_processed_total{source_app="pod-a", destination_app="pod-b", source_namespace="default", destination_namespace="default"}[$__rate_interval]))
```

- Pod B to Pod A Communication

```promql
sum by(source_app, destination_app, destination_namespace, source_namespace) (rate(hubble_flows_processed_total{source_app="pod-b", destination_app="pod-a"}[$__rate_interval]))
```

- Pod to External World Communication

```promql
sum by(source_app) (rate(hubble_flows_to_world_total{source_app="test-connectivity", source_namespace="default"}[$__rate_interval]))
```

- External World to Pod Communication

```promql
sum by(destination_app) (rate(hubble_flows_processed_total{destination_namespace="default", destination_app="http-echo"}[$__rate_interval]))
```
Alternatives
1. Calico
- Pricing Model: Open-source with optional commercial support.
- Pros:
- Provides load balancing and in-kernel security enforcement.
- Widely used in Kubernetes environments.
- Cons:
- Primarily focused on networking rather than comprehensive observability.
- License Model: Open-source licensing with commercial options.
2. Pixie
- Pricing Model: Open-source with potential commercial features for enterprise users.
- Pros:
- Automatically captures telemetry data without manual instrumentation.
- Provides high-level service maps and detailed application traffic views.
- Cons:
- May not offer as much low-level network detail as other tools.
- License Model: Open-source with optional enterprise features.
3. Kubeshark
- Pricing Model: Open-source community edition with proprietary enterprise options.
- Pros: Provides real-time protocol-level visibility into Kubernetes traffic.
- Cons: Primarily focused on Kubernetes; may not be suitable for non-containerized applications.
- License Model: Open-source with enterprise licensing for advanced features.
4. OpenTelemetry Network (OpenTelemetry Network Monitoring)
- Pricing Model: Open-source, supported by a vibrant community and cloud providers.
- Pros:
- Unified framework for collecting, processing, and exporting metrics, logs, and traces.
- Provides visibility into network traffic as part of broader observability.
- Vendor-neutral, integrates seamlessly with Prometheus, Grafana, and Jaeger.
- Cons:
- Requires careful setup to capture eBPF-based network telemetry.
- May not offer as detailed protocol-level traffic analysis compared to dedicated tools like Kubeshark.
- License Model: Open-source; the project is hosted by the CNCF (Cloud Native Computing Foundation).
Conclusion
Due to increasing threats and growing compliance requirements, we should monitor both east-west and north-south traffic in our Kubernetes clusters. eBPF-based tools like Cilium Hubble are revolutionizing network observability in Kubernetes, offering deep visibility into traffic, service dependencies, and security. In the cloud-native era, adopting eBPF-based solutions is essential for managing the complexity of Kubernetes while ensuring security and performance.
Enhance Your Kubernetes Observability with eBPF
Discover how eBPF-based solutions can enhance your Kubernetes environment and security—get started with the right tool for your needs.