
While running, most applications, containers, and virtual machines constantly generate information about numerous events. These events can be anything from severe errors to a simple notice that the server successfully answered a request. Collecting and analysing this log data becomes challenging in a multi-tiered architecture or dynamic microservice environment. The DevOps Tools Engineer exam covers log management and analysis in objective 704.3.
Logging is a foundational capability in modern computing systems, acting as the primary mechanism for observing behavior, diagnosing failures and attacks, auditing activity, and ensuring compliance. At its core, logging is the structured recording of events generated by applications, operating systems, and infrastructure components. These records—commonly referred to as logs—capture contextual information about what happened, when it happened, and often why it happened.
Application logging focuses on events generated within the logic of software systems. These events may include user actions, errors, state transitions, API calls, and performance metrics. Logs are typically emitted through logging libraries integrated into the application code, such as log4j, logback, or language-native frameworks. Each log entry is often structured with fields such as timestamp, severity level (e.g., INFO, WARN, ERROR), message, and contextual metadata.
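A single entry produced by such a framework, rendered as JSON, might look like the following (field names and values are illustrative; the exact structure depends on the library and its configuration):

```json
{
  "timestamp": "2024-05-01T12:34:56.789Z",
  "level": "ERROR",
  "logger": "com.example.billing.InvoiceService",
  "message": "Failed to charge customer",
  "customer_id": "c-1042",
  "error": "card_declined"
}
```

Structured entries like this are far easier for downstream tools to parse and index than free-form text lines.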
System logging, on the other hand, captures events generated by the operating system and its components. This includes kernel messages, service lifecycle events, authentication attempts, and hardware-related notifications. System logs are essential for understanding the health and behavior of the host environment and are typically managed by system-level logging services.
The lifecycle of a log entry follows a pipeline that can be broadly divided into four stages: generation, collection, processing, and visualization.
In the generation phase, logs are produced by applications or system components. These logs may be written to files, standard output (stdout), or system logging sockets.
During collection, log agents or forwarders—such as Filebeat or Fluent Bit—monitor log sources and ship data to centralized systems. These agents are lightweight and designed to operate efficiently across distributed environments.
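As a sketch of what such an agent's configuration looks like, a minimal `filebeat.yml` that tails application logs and ships them to Logstash could read as follows (paths and hostnames are placeholders):

```yaml
filebeat.inputs:
  - type: filestream            # the current file-tailing input type
    id: app-logs
    paths:
      - /var/log/myapp/*.log    # placeholder path

output.logstash:
  hosts: ["logstash:5044"]      # port served by the Logstash Beats input
```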
Processing involves parsing, filtering, transforming, and enriching log data. Tools like Logstash or Fluentd apply pipelines to normalize log formats, extract fields, and prepare the data for indexing.
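A minimal Logstash pipeline illustrates these stages: it reads a web server log file, parses each line with grok, normalizes the timestamp, and hands the result to Elasticsearch (paths, hosts, and the index name are placeholders):

```
input {
  file {
    path => "/var/log/apache2/access.log"   # placeholder path
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }   # extract fields from Apache access logs
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ] # use the event's own timestamp
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"                  # one index per day
  }
}
```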
Finally, logs are stored and indexed in systems such as Elasticsearch or OpenSearch, enabling fast search and analytics. Visualization tools like Kibana or Grafana provide interfaces for querying and exploring log data.
This pipeline enables organizations to move from raw, unstructured log entries to actionable insights.
The Elastic Stack, the combination of Logstash, Elasticsearch, and Kibana, serves as the reference implementation on the LPI exam. Of these tools, Logstash usually requires the most configuration and is the central focus of this objective.
Elasticsearch is a distributed search and analytics engine that stores logs as indexed documents, enabling full-text search and aggregation queries. OpenSearch, a fork of Elasticsearch, provides similar capabilities with an open governance model.
Logstash is a data processing pipeline that ingests logs from multiple sources, applies filters, and outputs event information to storage. It supports a wide range of plugins for parsing and transformation.
Fortunately, the Logstash documentation is quite comprehensive. You should start with the first chapters, Logstash Introduction and Getting Started with Logstash. How Logstash Works summarizes the main elements of a Logstash pipeline.
Equipped with this knowledge, set up your first Elastic Stack. stack-docker provides a Docker Compose file that sets up the components of the Elastic Stack, and much more. Use this file both to gain more Docker experience and to set up Logstash, Elasticsearch, Kibana and, later, Filebeat. Alternatively, follow Sébastien Pujadas' elk-docker guide for setting up the Elastic Stack with Docker.
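If stack-docker feels like too much at first, a pared-down `docker-compose.yml` along the following lines brings up a single-node Elasticsearch with Kibana (image tags and settings are illustrative; check the current releases, and never disable security outside a local playground):

```yaml
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.0  # example tag
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false    # acceptable for a throwaway lab only
    ports:
      - "9200:9200"

  kibana:
    image: docker.elastic.co/kibana/kibana:8.13.0                # example tag
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
```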
Now that you have a playground, take a closer look at the Configure Logstash guide. Follow all the subchapters, as they cover important topics mentioned in the objectives.
Filebeat is designed for simplicity and efficiency, focusing on log shipping rather than transformation. It reads log files line by line and forwards them with minimal overhead.
The Filebeat documentation provides an overview of Filebeat along with the recommended getting started guide. The Logstash documentation describes Filebeat’s counterpart, the Beats input plugin.
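On the Logstash side, receiving events from Filebeat takes nothing more than the Beats input listening on the agreed port (5044 by convention); during experimentation, a stdout output makes the arriving events visible:

```
input {
  beats {
    port => 5044
  }
}

output {
  stdout {
    codec => rubydebug   # pretty-print each event for debugging
  }
}
```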
An alternative architecture is based on the Fluentd ecosystem. Fluentd acts as a unified logging layer, capable of collecting, processing, and forwarding logs. It supports structured logging and integrates with numerous backends.
Fluent Bit is a lightweight sibling of Fluentd, optimized for edge environments and containerized workloads. It is commonly used in Kubernetes clusters to collect logs from containers.
Another increasingly popular stack is based on Loki, developed by Grafana Labs. Unlike Elasticsearch, Loki is designed to index only metadata (labels) rather than full log content, significantly reducing storage and indexing costs.
Promtail is the log collection agent used with Loki. It scrapes logs from files or containers and attaches labels before sending them to Loki.
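A minimal Promtail configuration gives a feel for this label-centric model: it tails a log file, attaches a static `job` label, and pushes the entries to Loki (URLs, paths, and label values are placeholders):

```yaml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml     # remembers how far each file has been read

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: myapp
    static_configs:
      - targets: [localhost]
        labels:
          job: myapp                        # label that Loki will index
          __path__: /var/log/myapp/*.log    # placeholder path
```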
LPI also expects you to use syslog to ship log data to Logstash. In case you're not familiar with syslog, Aaron Leskiw's introduction to syslog is a good place to start. You might also want to review the manpage of syslog.conf(5). To turn Logstash into a syslog server, configure the Syslog input plugin.
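Configuration-wise, this amounts to a single input block; port 514 requires root privileges, so an unprivileged port such as 5514 is a common choice:

```
input {
  syslog {
    port => 5514    # unprivileged alternative to the traditional port 514
  }
}
```

On the client side, a classic rsyslog rule such as `*.* @logstash-host:5514` (UDP; use `@@` for TCP) forwards all messages to that listener.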
In addition to the Beats and Syslog input plugins, Logstash’s functionality can be extended through the use of numerous input, output and filter plugins. Browse through these indexes to learn more about the modules that are related to the technologies covered in the DevOps Tools Engineer exam.
Elasticsearch is responsible for storing the log data. While this sounds unspectacular, indexes and data retention should be configured within Elasticsearch to support the analysis of log data. The Elasticsearch documentation’s getting started guide gives you an initial overview of Elasticsearch itself. Afterwards, learn more about indexes and retiring data in Elasticsearch.
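To make "retiring data" concrete: with Elasticsearch's index lifecycle management (ILM), a policy like the one below (ages and sizes are illustrative) rolls an index over daily or at a size threshold and deletes indices after 30 days. The request is shown in the Kibana Dev Tools style:

```json
PUT _ilm/policy/logs-retention
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```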
Once data is stored in Elasticsearch, Kibana provides a graphical way to access, aggregate and explore the logged information. The Kibana documentation explains how to interactively explore data, how to use visualization tools and how to create dashboards.
Kibana provides a rich interface for exploring logs stored in Elasticsearch or OpenSearch. It supports dashboards, visualizations, and query languages such as KQL.
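A few illustrative KQL queries show the style (the field names are placeholders and depend on how your logs were parsed):

```
level: ERROR and message: *timeout*
response >= 500 and not user_agent: *bot*
host.name: (web-1 or web-2)
```

The first finds error entries mentioning a timeout, the second finds server errors while excluding crawlers, and the third matches either of two hosts.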
Grafana is a versatile visualization platform that supports multiple data sources, including Loki, Elasticsearch, and time-series databases like Prometheus. It enables the creation of unified dashboards that combine logs, metrics, and traces.
Graylog (formerly known as Graylog2) is another centralized logging platform that integrates collection, processing, and visualization. It uses Elasticsearch or OpenSearch as a backend and provides a user-friendly interface for managing log data, including alerting and stream-based routing.
Understanding how application and system logging works—and how modern logging stacks are architected—is essential for operating reliable and observable systems. From traditional syslog daemons to advanced distributed platforms like the Elastic Stack, Fluentd, and Loki, logging has evolved into a critical pillar of modern infrastructure.
Next week, we move on to the final objective of the LPI DevOps Tools v2.0 Engineer exam, where we will explore tracing and OpenTelemetry.
You’ve come a long way—keep building, keep experimenting, and keep sharpening your skills. And remember: you can deepen your preparation using the official, free Learning Materials provided by LPI.
<< Read the previous article of this series | Start the series from the beginning >>