DevOps Tools Introduction #03: Cloud Components and Platforms

February 3, 2026 - By Fabian Thorns and Uirá Ribeiro

Previous installments in this series described modern software development and got us thinking about architectures for that software. Equipped with this knowledge, we now focus on the building blocks of such applications. Objective 701.2 of LPI’s DevOps Tools Engineer exam covers software components and platforms. The objective mentions a number of technology components such as object stores, relational and NoSQL databases, message brokers, and big data services. Anyone who has worked in IT for some time will likely know at least one tool in each of these categories. A general understanding of the roles these components play in a software architecture is just as essential.

While these application components used to be installed manually by a system administrator preparing a software installation, cloud providers now offer to provision them instantly as a service. Most cloud providers build on existing technologies, integrate them into their own interfaces, and market the resulting products under distinct names.

Cloud Service Models

Cloud computing comprises several service levels, each dividing responsibilities differently between the provider and the user. The main models are IaaS, PaaS, SaaS, and FaaS, which sit on a spectrum from maximum control to maximum convenience.

The Infrastructure as a Service (IaaS) model provides virtualized computing resources over the internet, representing the most fundamental level of cloud service. The user gains access to virtual servers, storage, and networking, while retaining responsibility for the operating system, middleware, runtime, and applications. OpenStack exemplifies an open-source IaaS platform, with components like Nova (compute), Cinder (block storage), Swift (object storage), and Neutron (networking) providing the complete infrastructure.

The Platform as a Service (PaaS) model abstracts the underlying infrastructure, providing a complete environment for developing, deploying, and managing applications. The provider manages the operating system, middleware, and runtime, allowing developers to focus exclusively on the application code. Examples include Heroku, Google App Engine, and, in the OpenStack context, the Trove project for Database as a Service.

The Software as a Service (SaaS) model represents the highest level of service, providing complete, ready-to-use software over the internet. Users access applications without concerns about installation, maintenance, or infrastructure. The provider manages all technical aspects, from hardware to software updates. Examples include Google Workspace, Microsoft 365, and Salesforce.

The Function as a Service (FaaS) model, often associated with the term serverless computing, allows code to be executed in response to events without managing any infrastructure. The code runs in ephemeral containers that are initiated on demand, with billing based only on the actual execution time. This model is ideal for event-driven workloads, webhook processing, and microservices with variable demand. Examples include AWS Lambda, Google Cloud Functions, and Azure Functions.
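
To make the event-driven model more concrete, here is a minimal sketch of what such a function might look like. The `handler(event, context)` signature follows the convention used by the AWS Lambda Python runtime; the event layout (a JSON string under a `body` key) and the `action` field are assumptions made for illustration and are not prescribed by any provider.

```python
import json


def handler(event, context):
    """Entry point the platform would call once per incoming event."""
    # Assumed event layout: a JSON string under the "body" key (webhook style).
    payload = json.loads(event.get("body") or "{}")
    action = payload.get("action", "unknown")

    # Nothing is kept between invocations: the input arrives in the event
    # and the result goes back in the return value.
    return {
        "statusCode": 200,
        "body": json.dumps({"received": action}),
    }


if __name__ == "__main__":
    # Local simulation of a single invocation. On a real FaaS platform
    # the provider builds the event and context objects.
    fake_event = {"body": json.dumps({"action": "push"})}
    print(handler(fake_event, context=None))
```

Because the function holds no state of its own, the platform is free to start and discard instances as demand changes.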

Object Storage: Features and Concepts

Object storage is a data storage architecture that manages data as discrete objects. Unlike traditional file systems, which organize data in a directory hierarchy, or block storage, which manages data as blocks within sectors and tracks, object storage treats each data item as a self-contained unit. Each object typically includes the data itself, a variable amount of custom metadata, and a globally unique identifier. Amazon S3 (Simple Storage Service) has established itself as the most influential object storage service on the market, and its API has become a de facto standard widely adopted by other implementations.
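
The three parts of an object (data, metadata, identifier) can be sketched with a toy in-memory store. This is not the S3 API: `ToyObjectStore`, `put`, and `get` are hypothetical names, and the generated UUID merely stands in for a unique identifier. Real services such as S3 address objects by a bucket name and a key chosen by the client.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes                                   # the payload itself
    metadata: dict = field(default_factory=dict)  # free-form custom metadata


class ToyObjectStore:
    """Flat namespace: no directories, every object is addressed by its ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())  # stands in for a globally unique identifier
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]


store = ToyObjectStore()
oid = store.put(b"hello", content_type="text/plain", owner="alice")
obj = store.get(oid)
print(oid, obj.metadata, obj.data)
```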

Relational and NoSQL Databases

Databases are a fundamental component of almost all modern software applications. They can be categorized into two major groups: relational (SQL) and non-relational (NoSQL), each with specific characteristics, advantages, and use cases.

Relational databases store data in a structured tabular format, organizing information into rows and columns with well-defined relationships between tables. They use a predefined schema to structure the data and enforce referential integrity, ensuring consistency through ACID (Atomicity, Consistency, Isolation, and Durability) properties. MySQL has established itself as one of the most popular open-source relational databases globally, widely used in web applications. MariaDB emerged as a fork of MySQL, maintaining a high degree of compatibility while adding advanced features and remaining completely open source. PostgreSQL stands out as an open-source object-relational database system, recognized for its reliability, feature robustness, and compliance with SQL standards.
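
A short sketch can illustrate the predefined schema, referential integrity, and atomicity. It uses SQLite through Python's standard `sqlite3` module purely because it needs no server; SQLite stands in for MySQL, MariaDB, or PostgreSQL here, and the `customers` and `orders` tables as well as `place_orders` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Predefined schema: every order must reference an existing customer.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)

with conn:  # commits on success
    conn.execute("INSERT INTO customers (name) VALUES ('Alice')")  # gets id 1


def place_orders(rows):
    """Insert all rows atomically: either every row is stored or none is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO orders (customer_id, total) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The second row references a customer that does not exist, so the whole
# batch is rejected and the first row is rolled back as well.
print(place_orders([(1, 10.0), (99, 5.0)]))
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```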

NoSQL databases provide mechanisms for storing and retrieving data modeled differently from traditional tabular relations. They are often chosen for big data applications and real-time systems where schema flexibility and horizontal scalability are priorities. Redis functions as an in-memory data structure store, serving as a database, cache, and message broker. Its key-value architecture provides extremely low latency for read and write operations. MongoDB adopts a document model, storing data in a JSON-like format (BSON) that offers flexible schemas and native horizontal scalability. InfluxDB specializes in time-series data, optimized for storing and querying metrics, events, and measurements with timestamps; it is particularly useful in monitoring and IoT scenarios.
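
The three data models mentioned above can be contrasted with plain Python structures. This sketch does not use the Redis, MongoDB, or InfluxDB client libraries; the helper names `find`, `write_point`, and `query_range` and all sample data are hypothetical and only mimic the shape of each model.

```python
from bisect import insort

# Key-value model (the style Redis is known for): values are looked up by key.
kv_store = {}
kv_store["session:42"] = "alice"

# Document model (the style MongoDB is known for): JSON-like records in one
# collection do not have to share the same fields.
documents = [
    {"_id": 1, "name": "sensor-a", "tags": ["indoor"]},
    {"_id": 2, "name": "sensor-b", "location": {"lat": 52.5, "lon": 13.4}},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [d for d in collection if all(d.get(k) == v for k, v in criteria.items())]


# Time-series model (the style InfluxDB is known for): points are kept
# ordered by timestamp and queried by time range.
series = []


def write_point(timestamp, value):
    insort(series, (timestamp, value))  # keeps the list sorted by timestamp


def query_range(start, end):
    """Return points with start <= timestamp < end."""
    return [(t, v) for t, v in series if start <= t < end]


write_point(30, 21.7)
write_point(10, 21.5)  # arrives late, still ends up in timestamp order
write_point(20, 21.6)

print(kv_store["session:42"])
print(find(documents, name="sensor-b"))
print(query_range(10, 30))
```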

Message Brokers and Message Queues

In distributed system architectures, asynchronous communication is a fundamental pattern for decoupling services, improving scalability, and increasing system resilience. Message queues and message brokers are the central components that enable this communication. A message queue is a data structure that temporarily stores messages until they are processed by their recipients (called consumers). It typically delivers each message to a single consumer, generally in order of arrival (FIFO, First-In, First-Out), although specific implementations may offer different ordering and delivery guarantees.
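
The decoupling described above can be sketched inside a single process with Python's standard `queue` and `threading` modules. A real message broker runs as a separate service and adds persistence and network access; the names `consume`, `run_demo`, and `STOP` are invented for this illustration.

```python
import queue
import threading

STOP = object()  # sentinel that tells the consumer to shut down


def consume(message_queue, processed):
    """Take messages off the queue one at a time, in arrival order."""
    while True:
        message = message_queue.get()  # blocks until a message is available
        if message is STOP:
            break
        processed.append(message.upper())  # stand-in for real work


def run_demo(messages):
    message_queue = queue.Queue()  # thread-safe FIFO queue
    processed = []
    worker = threading.Thread(target=consume, args=(message_queue, processed))
    worker.start()
    for message in messages:
        message_queue.put(message)  # the producer does not wait for processing
    message_queue.put(STOP)
    worker.join()
    return processed


print(run_demo(["order created", "order paid", "order shipped"]))
# prints ['ORDER CREATED', 'ORDER PAID', 'ORDER SHIPPED']
```

The producer only knows the queue, not the consumer, which is what allows either side to be scaled or replaced independently.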

Apache Kafka is a distributed event streaming platform, designed to handle trillions of events daily. Kafka is used to build real-time data pipelines and streaming applications, combining messaging, storage, and stream processing into a single platform. MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol based on the publish/subscribe pattern, specifically designed for environments with limited bandwidth or unstable connections.
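
The publish/subscribe pattern itself can be shown with a small in-process sketch. It implements neither the MQTT wire protocol nor Kafka's log-based storage; `ToyBroker` and the topic names are hypothetical, and topics are matched by exact string comparison only.

```python
from collections import defaultdict


class ToyBroker:
    """Routes each published message to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # The publisher does not know who, if anyone, is listening.
        for callback in self._subscribers.get(topic, []):
            callback(topic, payload)


broker = ToyBroker()
dashboard, alerts = [], []

broker.subscribe("sensors/temperature", lambda t, p: dashboard.append(p))
broker.subscribe("sensors/temperature", lambda t, p: alerts.append(p))

broker.publish("sensors/temperature", 21.5)  # delivered to both subscribers
broker.publish("sensors/humidity", 40)       # no subscribers, silently dropped

print(dashboard, alerts)
```

Unlike the single-consumer queue, every subscriber of a topic receives its own copy of each message.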

Big Data Services

Big Data refers to datasets characterized by the "3 Vs": Volume (massive quantity), Velocity (high rate of generation and processing), and Variety (diversity of formats and sources). Big Data services provide the tools and infrastructure to store, process, and analyze data volumes that exceed the capacity of traditional tools.

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It stores, searches, and analyzes large volumes of data in near real-time and is widely used for full-text search, log analysis, infrastructure monitoring, and security analysis (SIEM). OpenSearch emerged as a fork of Elasticsearch, created by Amazon Web Services after changes in the original project’s license. OpenSearch maintains compatibility with the Elasticsearch API and is developed under the Apache 2.0 license, ensuring that it remains open source. Both tools offer similar functionality, including distributed indexing, real-time search, and visualization through dashboards.

Content Delivery Networks (CDNs)

A Content Delivery Network (CDN) is a geographically distributed network of servers that collaborate to deliver internet content quickly. The fundamental principle is to place copies of the content at multiple points of presence (PoPs) around the world, reducing the physical distance between the user and the server handling their request. CDNs enable the rapid transfer of assets needed to load web content, including HTML pages, JavaScript files, CSS stylesheets, images, and videos. Besides reducing latency, CDNs offer benefits such as DDoS protection, SSL/TLS termination at edge servers, content compression, and image optimization.

A CDN operates by caching static content on edge servers, which respond to requests from the nearest users. When a user requests a resource, the CDN directs the request to the geographically closest edge server. If the content is cached, it is served immediately; otherwise, the edge server fetches the content from the origin server, caches it, and delivers it to the user.
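
The hit-or-miss flow described above can be sketched as follows. The classes `OriginServer` and `EdgeServer` are invented for illustration; the sketch leaves out cache expiry, routing to the nearest edge, and everything network related.

```python
class OriginServer:
    """Holds the authoritative copy of the content."""

    def __init__(self, content):
        self._content = content
        self.requests = 0  # counts how often the origin is actually contacted

    def fetch(self, path):
        self.requests += 1
        return self._content[path]


class EdgeServer:
    """Serves from its local cache and falls back to the origin on a miss."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}

    def get(self, path):
        if path in self._cache:
            return self._cache[path], "HIT"
        body = self._origin.fetch(path)  # cache miss: go back to the origin
        self._cache[path] = body         # keep a copy for later requests
        return body, "MISS"


origin = OriginServer({"/style.css": "body { margin: 0 }"})
edge = EdgeServer(origin)

print(edge.get("/style.css"))  # first request is a miss
print(edge.get("/style.css"))  # second request is served from the cache
print(origin.requests)         # the origin was contacted only once
```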

Identity and Access Management (IAM)

Identity and Access Management (IAM) is a framework of policies, processes, and technologies that ensures that the right entities (people, services, devices) have the appropriate access to technological resources. In cloud environments, IAM is a critical security component, controlling who can do what on which resources. The fundamental concepts of IAM include identities (representations of users, services, or devices), authentication (verifying that an entity is who it claims to be), authorization (determining what an authenticated entity can do), and auditing (recording actions for compliance and investigation).
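
The four concepts can be tied together in a deliberately simplified sketch. It does not model any provider's IAM service: the identities, secrets, policy format, and function names are all hypothetical, and the unsalted SHA-256 digest is used only for brevity, whereas real systems rely on dedicated password hashing or signed tokens.

```python
import hashlib
import hmac


def _digest(secret):
    # Illustration only: not a suitable way to store real credentials.
    return hashlib.sha256(secret.encode()).hexdigest()


# Identities: users and services known to the system (hypothetical data).
IDENTITIES = {
    "alice": _digest("alice-secret"),
    "backup-service": _digest("svc-token"),
}

# Policies: who can do what on which resource.
POLICIES = {
    "alice": {("read", "bucket/reports"), ("write", "bucket/reports")},
    "backup-service": {("read", "bucket/reports")},
}

AUDIT_LOG = []


def authenticate(identity, secret):
    """Verify that the caller is who it claims to be."""
    expected = IDENTITIES.get(identity)
    return expected is not None and hmac.compare_digest(expected, _digest(secret))


def authorize(identity, action, resource):
    """Decide what an authenticated identity is allowed to do."""
    return (action, resource) in POLICIES.get(identity, set())


def request(identity, secret, action, resource):
    allowed = authenticate(identity, secret) and authorize(identity, action, resource)
    # Auditing: record every decision, including denials.
    AUDIT_LOG.append((identity, action, resource, "allow" if allowed else "deny"))
    return allowed


print(request("alice", "alice-secret", "write", "bucket/reports"))
print(request("backup-service", "svc-token", "write", "bucket/reports"))
print(AUDIT_LOG)
```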

In next week’s article, we’ll move forward and dive into Source Code Management, exploring how to efficiently manage, version, and share source code in modern development workflows.


Authors

Fabian Thorns

Fabian Thorns is the Director of Product Development at Linux Professional Institute, LPI. He holds an M.Sc. in Business Information Systems, is a regular speaker at open source events, and is the author of numerous articles and books. Fabian has been part of the exam development team since 2010. Connect with him on LinkedIn, XING or via email (fthorns at www.lpi.org).

Uirá Ribeiro

Uirá Ribeiro is a distinguished leader in the IT and Linux communities, recognized for his vast expertise and impactful contributions spanning over two decades. As the Chair of the Board at the Linux Professional Institute (LPI), Uirá has helped shape the global landscape of Linux certification and education. His robust academic background in computer science, with a focus on distributed systems, parallel computing, and cloud computing, gives him a deep technical understanding of Linux and free and open source software (FOSS). As a professor, Uirá is dedicated to mentoring IT professionals, guiding them toward LPI certification through his widely respected books and courses. Beyond his academic and writing achievements, Uirá is an active contributor to the free software movement, frequently participating in conferences, workshops, and events organized by key organizations such as the Free Software Foundation and the Linux Foundation. He is also the CEO and founder of Linux Certification Edutech, where he has been teaching online Linux courses for 20 years, further cementing his legacy as an educator and advocate for open-source technologies.
