How to Implement OpenTelemetry Auto Instrumentation for Effortless Observability

Introduction

Observability is critical in modern cloud-native systems for tracking application health, resolving problems, and improving performance. Previously, instrumenting an application meant modifying its source code with SDK calls to collect telemetry data such as metrics and traces. However, we can now achieve this level of visibility with no code changes, thanks to OpenTelemetry's auto-instrumentation capabilities.

In this article, we look at how to instrument applications with minimal effort using OpenTelemetry auto-instrumentation. Specifically, we deploy a simple Node.js application and use it to showcase the metrics and traces that OpenTelemetry collects. We install the necessary OpenTelemetry components, configure the application for auto-instrumentation, and deploy it in a Kubernetes environment. Once it is deployed, we make requests to the application and observe the collected metrics and traces in real time, which gives us insight into its performance and behavior. This hands-on approach shows how observability can be added to an application without extensive manual coding.

Foundation

A quick detour into OpenTelemetry and eBPF will help us understand how auto-instrumentation is possible.

OpenTelemetry

OpenTelemetry is a vendor-neutral, open-source standard for collecting telemetry data such as traces, metrics, and logs. It provides an extensive set of APIs, libraries, agents, and instrumentation for distributed tracing and monitoring. It is available across many technologies and programming languages, which makes it an effective observability solution for cloud-native apps, microservice architectures, and distributed systems. Typical uses include monitoring and tracing complex application activity and interactions in Kubernetes clusters, serverless apps, and other environments.

eBPF

Extended Berkeley Packet Filter, or eBPF, is a powerful and efficient technology that runs custom code inside an operating system's kernel without modifying the kernel itself. eBPF was initially designed for network packet filtering, but it has since grown to support a variety of use cases, such as security, tracing, and performance monitoring. Developers can attach small, sandboxed programs to different locations inside the kernel or user space, which provides deep observability and real-time control over system behavior while minimizing overhead and the risk of system crashes.

Auto-Instrumentation

Auto-instrumentation refers to the ability to automatically add tracing or metrics to your application without changing the code. OpenTelemetry provides this feature for many popular programming languages, including Java, Python, Go, and Node.js.

How does it work?

OpenTelemetry implements auto-instrumentation through an agent that attaches to the application runtime. The agent tracks and gathers telemetry information from libraries and frameworks using a variety of methods, including low-level hooks, dynamic proxies, and bytecode manipulation. There are two primary ways to deploy it: the Kubernetes Operator and Zero-Code Instrumentation. Both capture telemetry data without requiring significant changes to application code.

Kubernetes Operator: The OpenTelemetry Operator for Kubernetes simplifies the process of injecting auto-instrumentation into applications running in a Kubernetes environment. It works by adding an init container to the application's pod, which injects the necessary libraries and configurations for auto-instrumentation. This approach supports multiple languages, including .NET, Java, Node.js, Python, and Go.
Zero-Code Instrumentation: Zero-code instrumentation allows applications to be instrumented without modifying their source code by attaching an agent at runtime. For example:
1. JavaScript: The @opentelemetry/api and @opentelemetry/auto-instrumentations-node packages install the API, SDK, and the instrumentation tools.
2. Python: The opentelemetry-distro package installs the API, SDK, and the opentelemetry-bootstrap and opentelemetry-instrument tools.
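
As a rough illustration of the Node.js case, the sketch below is a plain HTTP server that contains no OpenTelemetry code at all. The file name `app.js`, the `handleRequest` helper, and the `START_SERVER` environment variable are hypothetical names chosen for this example. Assuming the two packages above are installed, such an app would typically be launched with `node --require @opentelemetry/auto-instrumentations-node/register app.js`, so that instrumentation is loaded before the application code runs.

```javascript
// app.js: a plain HTTP server with no OpenTelemetry imports or calls.
// Assumed launch command for zero-code instrumentation:
//   node --require @opentelemetry/auto-instrumentations-node/register app.js
const http = require("node:http");

// Hypothetical request handler; it only returns a small JSON payload.
function handleRequest(req, res) {
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ status: "ok", path: req.url }));
}

const server = http.createServer(handleRequest);

// Hypothetical switch: listen only when START_SERVER=1 is set, so the file
// can also be loaded without opening a port.
if (process.env.START_SERVER === "1") {
  server.listen(3000, () => console.log("listening on http://localhost:3000"));
}
```

The point of the sketch is what is absent: the application never references a tracer or exporter, and all telemetry wiring happens at process start.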

The operation of OpenTelemetry Zero-Code Instrumentation is as follows:

Setting up the OpenTelemetry Agent or SDK: OpenTelemetry offers agents for the majority of languages that can be installed alongside the application without requiring modifications to the source code.
Automatic Hooks: The agent automatically detects and instruments the frameworks and libraries that your application uses, such as database connectors and HTTP libraries.
Telemetry Collection: After deployment, the agent automatically gathers traces and metrics to monitor resource usage, response times, and requests.
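
The "Automatic Hooks" step can be pictured as wrapping a library function at runtime. The sketch below is a simplified illustration of that wrapping idea only, not OpenTelemetry's actual implementation. The names `instrument`, `recordedSpans`, and `fakeDb` are hypothetical.

```javascript
// Hypothetical in-memory store; a real agent would export spans instead.
const recordedSpans = [];

// Replace target[methodName] with a wrapper that times every call and
// records a span-like entry, then delegates to the original function.
function instrument(target, methodName, spanName) {
  const original = target[methodName];
  target[methodName] = function wrapped(...args) {
    const start = Date.now();
    try {
      return original.apply(this, args);
    } finally {
      recordedSpans.push({ name: spanName, durationMs: Date.now() - start });
    }
  };
}

// Stand-in for a library the application already uses (for example a
// database client); the application code that calls it stays unchanged.
const fakeDb = {
  query(sql) {
    return `rows for: ${sql}`;
  },
};

instrument(fakeDb, "query", "db.query");
const result = fakeDb.query("SELECT 1");
console.log(result, recordedSpans);
```

The caller still invokes `fakeDb.query` exactly as before. The wrapper is installed from the outside, which is why this style of instrumentation needs no changes to application code.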

Key Benefits of OpenTelemetry Auto Instrumentation

Auto-instrumentation offers several benefits:

Zero Code Involvement: One of the most significant advantages of auto-instrumentation is that it requires zero changes to the application code. The instrumentation is achieved entirely through configuration and external agents, allowing teams to enable observability without modifying existing systems.
Reduced Setup Time: With traditional instrumentation, developers have to write dedicated code for logging, tracing, and metrics collection. Auto-instrumentation removes this requirement, which enables teams to get started much faster.
Centralized Monitoring: With OpenTelemetry, you can consolidate telemetry data from different services into a centralized monitoring platform. This makes it easier to track the health of an entire distributed system, especially in complex microservice architectures.
Multiple Language Support: OpenTelemetry provides auto-instrumentation for multiple programming languages, including Java, Python, Go, .NET, etc. This means that you can instrument applications written in various languages without needing to write custom instrumentation for each one.

Limitations of OpenTelemetry Auto Instrumentation

Dependency on OpenTelemetry Agent: Since auto-instrumentation relies on the OpenTelemetry agent, monitoring and observability depend on the correct configuration and operation of the agent. Any issues with the agent (e.g., resource constraints or configuration errors) may impact telemetry collection.
Limited Coverage: While OpenTelemetry provides a wide range of integrations, it may not cover every library, language, or framework used in your application. Some custom or less common libraries might require manual instrumentation.
Limited Language Support: Although OpenTelemetry supports several programming languages, not all of them have full support for auto-instrumentation. Some languages might only support partial instrumentation or require additional configuration. For example, Ruby, C/C++, and Elixir are among the languages currently not supported for auto-instrumentation.

Demo of OpenTelemetry Auto-instrumentation

This section provides a step-by-step guide to setting up a demo environment for OpenTelemetry auto-instrumentation in Node.js.

Architecture

In this demo, we will deploy a simple to-do application written in Node.js to our Kubernetes cluster. The application lets users manage their activities interactively, which makes it a convenient way to demonstrate the setup. To improve observability and monitor the application's behavior, we will integrate Grafana Cloud for log and metric visualization. Grafana will help us monitor the application's health and performance in real time, identify potential problems, and gain insight into its operations. Once everything is set up, we will validate the integration by visualizing the collected telemetry in Grafana, which provides a comprehensive view of the application's performance.

Application Architecture

Kubernetes Architecture

App Deployment Architecture

Prerequisites

Kubernetes Cluster: We are using kind, but you can use any Kubernetes cluster. Install kind and create a cluster with the following command:

kind create cluster

Cert-manager: The OpenTelemetry Operator depends on cert-manager, so install it first.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.3/cert-manager.yaml

OpenTelemetry Operator: Once cert-manager is running, install the Operator.

kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml

Application Setup

Clone the repository https://github.com/cloudraftio/nodejs-otel-auto-instrumentation, then navigate to its deployment section. It contains three YAML configuration files: deployment.yaml, collector.yaml, and instrumentation.yaml.

deployment.yaml: This configuration file contains the information about our application deployment. It includes the following resources: Deployment, Services, ConfigMaps, Secrets, and a Namespace. Note the annotations added to the existing deployment; they are required for OpenTelemetry auto-instrumentation to be configured correctly.

instrumentation.opentelemetry.io/inject-nodejs: "true"
instrumentation.opentelemetry.io/nodejs-container-names: todo-app

With these annotations in place, telemetry data such as traces and metrics is collected without the need for extensive manual coding. The annotations help the OpenTelemetry agent automatically detect and instrument the application, so that performance monitoring and observability are integrated into the deployment process.

collector.yaml: This is the OpenTelemetry Collector configuration. The Collector receives the data produced by the instrumentation and sends it to the backend (here, Grafana Cloud).
instrumentation.yaml: This is the OpenTelemetry Instrumentation configuration. The Operator uses it to inject an init container into the application's pod, targeting the pods that carry the required auto-instrumentation annotations.

Open the collector.yaml file and fill in your Grafana Cloud OTLP credentials.

Go to your Grafana Cloud OTLP configuration page: https://grafana.com/orgs/your-organization/stacks/your-instance/otlp-info
Set the endpoint in collector.yaml to the OTLP endpoint shown in your Grafana Cloud configuration.
Generate a password/API token to access your Grafana instance, then base64-encode the pair in the form instanceID:token. For example, with the following values:

InstanceID: 1105050
API/Password: gcs_kTKUeyQkWefHQEot5e8RvWsLDVF23CH3lF0wjpBy7eYq4QDMFs #this is not a real password

Copy the encoded value and paste it into the Authorization section of the collector.yaml.

echo -n '1105050:gcs_kTKUeyQkWefHQEot5e8RvWsLDVF23CH3lF0wjpBy7eYq4QDMFs' | base64
Result: MTEwNTA1MDpnY3Nfa1RLVWV5UWtXZWZIUUVvdDVlOFJ2V3NMRFZGMjNDSDNsRjB3anBCeTdlWXE0UURNRnM=
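
The same encoding can be reproduced in Node.js, which is handy if you want to generate the value from a script. This is a minimal sketch using the placeholder values from the example above. The `encodeCredentials` helper is a hypothetical name, and the `Basic` prefix assumes the collector sends the value as an HTTP Basic credential.

```javascript
// Values copied from the example above; the token is not a real secret.
const instanceId = "1105050";
const apiToken = "gcs_kTKUeyQkWefHQEot5e8RvWsLDVF23CH3lF0wjpBy7eYq4QDMFs";

// Encode "instanceId:token" as base64, equivalent to `echo -n ... | base64`.
function encodeCredentials(id, token) {
  return Buffer.from(`${id}:${token}`, "utf8").toString("base64");
}

const encoded = encodeCredentials(instanceId, apiToken);

// Assumption: the value is used as an HTTP Basic credential.
const authorizationHeader = `Basic ${encoded}`;
console.log(authorizationHeader);
```

Decoding the value again with `Buffer.from(encoded, "base64").toString("utf8")` is a quick way to confirm that no trailing newline or stray character slipped into the credential.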

Enabling Auto-instrumentation

When the pod starts up, the annotation tells the Operator to look for an Instrumentation object in the pod’s namespace and to inject auto-instrumentation into the pod. The Operator adds an init container called opentelemetry-auto-instrumentation to the application’s pod, which is then used to inject the auto-instrumentation into the app container. If the Instrumentation resource isn’t present by the time the application is deployed, however, the init container can’t be created. Therefore, if the application is deployed before the Instrumentation resource, the auto-instrumentation will fail.

kubectl apply -f deployment.yaml
kubectl apply -f instrumentation.yaml
kubectl apply -f collector.yaml

Because the deployment was applied before the Instrumentation resource existed, delete the todo-app-<> pod so that it is recreated with the init container.

Making Sample Requests to the Web Service

kubectl port-forward svc/todo-app-service 3000:3000 -n otel-auto
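
With the port-forward running, a short script can generate some traffic for the instrumentation to record. The sketch below is only an illustration: `sendSampleRequests` is a hypothetical helper, and the root path and the request count are assumptions, since the application's routes are not listed here. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Send `count` GET requests to `baseUrl` one after another and return the
// HTTP status codes. `fetchImpl` defaults to the global fetch (Node.js 18+).
async function sendSampleRequests(baseUrl, count, fetchImpl = fetch) {
  const statuses = [];
  for (let i = 0; i < count; i += 1) {
    const response = await fetchImpl(baseUrl);
    statuses.push(response.status);
  }
  return statuses;
}

// Example call (requires the port-forward above to be running):
// sendSampleRequests("http://localhost:3000/", 5).then(console.log);
```

Each request should then surface as a trace and contribute to the HTTP server metrics once the Collector has forwarded the data.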

Visualizing Metrics

Now navigate to your Grafana Cloud Default Metrics section. Here you can see various metrics: http_server_duration_millisecond_count, http_client_duration_millisecond_count, etc.

We have created a sample dashboard to visualize these metrics. The repository contains a panel.json file that you can import into your own dashboards. From there you can explore all the metrics produced by OpenTelemetry zero-code instrumentation, also known as auto-instrumentation.

HTTP Requests: Requests made by users to the application

HTTP Status Code: Status codes returned for those requests

Routes: Routes accessed during interaction with the server

Let’s see a few traces generated by the instrumentation

Detailed view of the traces for GET request

Detailed information about the traces for POST request

Conclusion

OpenTelemetry stands as a leading standard in observability. Adopting OpenTelemetry’s automatic instrumentation in projects offers many advantages, including uniform observability across distributed systems. This facilitates troubleshooting and performance monitoring without the necessity for code modifications. By reducing manual effort, auto instrumentation expedites setup and ensures reliable data collection.

Observability's primary benefits lie in its ability to improve system reliability, expedite incident resolution, optimize performance, minimize downtime, and reduce operational costs. Simultaneously, it fosters increased developer productivity. Consequently, organizations can realize a compelling return on investment by leveraging OpenTelemetry’s capabilities to implement Observability. This investment leads to enhanced application performance, cost-effectiveness, and business value.
