By Kejia Xu (Yemo), from Alibaba Cloud Storage Team
In this article, we will discuss the technical principles of some common design patterns from a practical perspective based on the iLogtail project.
Design patterns are valuable summaries of experience in software development. However, they are often presented in an abstract and theoretical manner, which can make learning them directly quite dull for beginners or developers without much experience.
Many books and online articles try to introduce design patterns in a simplified way with practical scenarios. However, their examples are usually constructed for teaching purposes and do not come from production-level software. Since putting software theory into practice matters so much, this raises a question: Is there an opportunity to learn from real production-level code?
iLogtail, developed by the Alibaba Cloud Simple Log Service (SLS) team, is a collector for observability data. It is now open source on GitHub. Its main purpose is to help developers build a unified data collection layer. Over years of technological evolution, iLogtail has incorporated various design patterns, which have significantly improved the quality and maintainability of the software. I would also like to express my gratitude to the ByteDance developers for their contributions in upgrading and optimizing parts of the iLogtail Golang architecture.
If you have ever found learning design patterns boring, try learning them with iLogtail! You are welcome to participate in community discussions in any form. I believe you will discover that learning design patterns can be a fascinating journey!
Creational patterns provide a way to create objects while concealing the creation details. When it comes to object creation, the most familiar method is to use the new operator and then set the relevant properties. However, in many scenarios, we need to provide a more user-friendly approach for applications to create objects, especially when dealing with various complex classes.
The singleton pattern ensures that only one instance of a class can be created during the entire system lifecycle, guaranteeing the uniqueness of the class. For certain resource management scenarios like configuration management, having a global object is necessary as it facilitates coordinating the overall behavior of the system.
In iLogtail, collection configuration management plays a crucial role in connecting user collection configurations with internal collection tasks. You can create collection tasks by loading and parsing collection configurations.
The singleton pattern is well suited to the ConfigManager, which acts as a process-level management mechanism. When iLogtail starts, it loads all collection configurations, and it supports dynamically reloading changed configurations at runtime. The singleton pattern avoids the problem of synchronizing state among multiple instances. It also provides a unified global interface that every module can access easily.
class ConfigManager : public ConfigManagerBase {
public:
    static ConfigManager* GetInstance() {
        static ConfigManager* ptr = new ConfigManager();
        return ptr;
    }

    // Construction, destruction, copy construction, and assignment construction are private to prevent multiple objects from being constructed.
private:
    ConfigManager();
    virtual ~ConfigManager();
    ConfigManager(const ConfigManager&) = delete;
    ConfigManager& operator=(const ConfigManager&) = delete;
    ConfigManager(ConfigManager&&) = delete;
    ConfigManager& operator=(ConfigManager&&) = delete;
};
The GetInstance() function is the key to the singleton pattern. It uses a static local variable inside a static member function to ensure that there is only one instance of the ConfigManager class in the application. To prevent additional ConfigManager objects from being created by copying, moving, or assigning, the constructor and destructor are private, and the copy and move constructors and assignment operators are declared private and marked as deleted.
The implementation also relies on the thread-safe initialization of function-local static variables guaranteed since C++11 (often called "magic statics"). In the words of the standard: "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization." This ensures that the instance is created exactly once, even in concurrent programs.
The factory pattern is one of the most commonly used approaches to object creation. With this pattern, the creation logic is hidden from the client. The client only needs to tell the factory class which object it wants, and the factory class handles the rest.
To collect and process various types of observability data, iLogtail C++ defines logs, metrics, and traces in its pipelines. These are abstracted into pipeline events, a common format for data flowing through the pipeline. A pipeline event factory in core/models creates objects such as logs, metrics, and spans. This keeps data stream calls flexible, reduces dependencies in business code, and improves the extensibility of the data models.
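The idea can be sketched as follows. This is a minimal illustration rather than the actual code in core/models: the names `EventType`, `PipelineEvent`, and `PipelineEventFactory::Create` are assumptions chosen for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical event kinds; the real iLogtail types may differ.
enum class EventType { Log, Metric, Span };

// Common base class so the pipeline can treat every event uniformly.
class PipelineEvent {
public:
    virtual ~PipelineEvent() = default;
    virtual EventType Type() const = 0;
    virtual std::string Name() const = 0;
};

class LogEvent : public PipelineEvent {
public:
    EventType Type() const override { return EventType::Log; }
    std::string Name() const override { return "log"; }
};

class MetricEvent : public PipelineEvent {
public:
    EventType Type() const override { return EventType::Metric; }
    std::string Name() const override { return "metric"; }
};

class SpanEvent : public PipelineEvent {
public:
    EventType Type() const override { return EventType::Span; }
    std::string Name() const override { return "span"; }
};

// The factory hides which concrete class is instantiated.
class PipelineEventFactory {
public:
    static std::unique_ptr<PipelineEvent> Create(EventType type) {
        switch (type) {
            case EventType::Log: return std::make_unique<LogEvent>();
            case EventType::Metric: return std::make_unique<MetricEvent>();
            case EventType::Span: return std::make_unique<SpanEvent>();
        }
        assert(false && "unknown EventType");
        return nullptr;
    }
};
```

A caller would write `PipelineEventFactory::Create(EventType::Metric)` and work only with the `PipelineEvent` interface, so adding a new event type does not affect existing call sites.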
The generator pattern, also known as the builder pattern, allows for the step-by-step creation of complex objects. It provides the flexibility to use the same creation code to generate objects of different types and forms. The objects constructed using the generator pattern are typically large and complex, requiring the assembly of multiple components according to a predefined manufacturing process, similar to an automobile production line.
The generator pattern involves four roles:
• Product: This represents a complex object composed of multiple components, each with its own construction method and representation.
• Builder: Responsible for defining the abstract interface for building complex objects. It includes methods for constructing each component.
• ConcreteBuilder: Implements the Builder interface, providing the construction method for each component. It ultimately combines the components to create a complete complex object.
• Director: Manages the Builder objects and invokes their methods to construct complex objects. The Director does not create complex objects directly but builds them through the Builder objects.
The iLogtail Go pipeline can be seen as a complex production line, which is a typical application scenario for the generator pattern. In this scenario, the pipeline manager (Director) divides the pipeline construction process into multiple plugin construction processes. The PipeBuilder creates and initializes the plugins at each stage, and finally, these plugins are combined to form a complete pipeline object (Product).
The application of the generator pattern significantly enhances the extensibility and maintainability of the iLogtail plugin mechanism. It allows users to easily extend various collection and processing scenarios based on their specific needs.
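The roles described above can be sketched in a few lines of C++. This is an illustrative simplification, not the Go pipeline code: `Pipeline`, `PipelineBuilder`, and `BuildFileToSlsPipeline` are hypothetical names, and plugins are reduced to plain strings.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Product: a pipeline assembled from named plugin stages (hypothetical model).
struct Pipeline {
    std::vector<std::string> inputs;
    std::vector<std::string> processors;
    std::string flusher;
};

// Builder: each step adds one component and returns *this for chaining.
class PipelineBuilder {
public:
    PipelineBuilder& AddInput(std::string name) {
        pipeline_.inputs.push_back(std::move(name));
        return *this;
    }
    PipelineBuilder& AddProcessor(std::string name) {
        pipeline_.processors.push_back(std::move(name));
        return *this;
    }
    PipelineBuilder& SetFlusher(std::string name) {
        pipeline_.flusher = std::move(name);
        return *this;
    }
    Pipeline Build() {
        // A pipeline without a source or a sink is not usable.
        assert(!pipeline_.inputs.empty() && !pipeline_.flusher.empty());
        return std::move(pipeline_);
    }

private:
    Pipeline pipeline_;
};

// Director: knows the order of the steps, not how each step is performed.
Pipeline BuildFileToSlsPipeline(PipelineBuilder& builder) {
    return builder.AddInput("file_log")
        .AddProcessor("processor_regex")
        .AddProcessor("processor_filter")
        .SetFlusher("flusher_sls")
        .Build();
}
```

The director function fixes the construction sequence, while the builder owns the partially built product until `Build()` hands it over.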
The prototype pattern allows new objects to be created by copying existing objects, rather than by explicit instantiation.
The prototype pattern is often used when a large number of similar objects need to be created. In iLogtail's data processing flow, creating multiple similar pipeline event objects by copying a prototype can improve both the efficiency and the maintainability of data processing.
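A common way to express this in C++ is a virtual `Clone()` method. The sketch below assumes a hypothetical `Event` base class with tags; it is not taken from the iLogtail source.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical event base class that supports copying through Clone().
class Event {
public:
    virtual ~Event() = default;
    // Returns an independent copy with the same dynamic type.
    virtual std::unique_ptr<Event> Clone() const = 0;

    void SetTag(const std::string& key, const std::string& value) { tags_[key] = value; }
    std::string GetTag(const std::string& key) const {
        auto it = tags_.find(key);
        return it == tags_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> tags_;
};

class LogEvent : public Event {
public:
    std::unique_ptr<Event> Clone() const override {
        // Guards against a subclass that forgets to override Clone().
        assert(typeid(*this) == typeid(LogEvent));
        return std::make_unique<LogEvent>(*this);  // copies tags and content
    }
    std::string content;
};
```

A pre-configured event can then serve as a template: each clone starts with the same tags and content and can be modified without affecting the prototype.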
Creational patterns are generally simple and serve the purpose of creating instance objects.
• Singleton pattern: Ensures that only one instance of a class exists and provides a global access point to that instance. It is suitable for managing global shared resources to avoid competition and conflicts between multiple instances. However, implementation details should be carefully considered.
• Factory pattern: Defines an interface for creating objects and allows subclasses to decide which class to instantiate. It is suitable for creating objects with similar properties and offers flexibility.
• Generator pattern: Divides the construction process of a complex object into multiple steps. It is suitable for creating complex objects and facilitates code maintenance and extension.
• Prototype pattern: Utilizes object copying to reduce complex creation processes.
Structural patterns provide a way to organize objects to achieve combinations and interactions between them.
The adapter pattern converts the interface of one type into the desired interface type, enabling objects with incompatible interfaces to work together.
The iLogtail process consists of two parts. The first part is the main binary process written in C++, which provides functions such as control, file collection, C++ accelerated processing, and SLS sending. The second part is the Golang plugins (libPluginBase.so) that extend the processing capability and support more upstream and downstream ecosystems through the plugin system.
In iLogtail, the main logic for sending data to SLS is implemented in C++ in Sender.cpp, which provides comprehensive reliability features for sending (such as exception handling, retry, and back pressure). In the Go pipeline, SlsFlusher also needs to send the collected and processed data to SLS. Implementing the same logic again on the Go plugin side would duplicate code. Therefore, the Go SlsFlusher forwards the processed data to C++, which completes the final transmission. However, the interfaces on the two sides of the language boundary are not directly compatible. In such cases, libPluginAdaptor.so acts as an adapter layer that bridges the Golang sending interface and the C++ sending interface.
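The following sketch shows the general shape of such an adapter on the C++ side. It is an assumption-laden illustration: `CppSender`, `SendFn`, and `SendAdapter` are hypothetical names, and the real interface exposed by libPluginAdaptor.so is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Adaptee: a C++ sender with a std::string-based interface (hypothetical).
class CppSender {
public:
    bool Send(const std::string& logstore, const std::string& payload) {
        sent_.push_back(logstore + ":" + payload);
        return true;
    }
    const std::vector<std::string>& Sent() const { return sent_; }

private:
    std::vector<std::string> sent_;
};

// Target: the plain C-style signature the plugin side is assumed to call.
// Returns 0 on success and -1 on failure.
using SendFn = int (*)(void* ctx, const char* logstore, const char* data, std::size_t len);

// Adapter: translates raw pointers and lengths into the C++ interface.
int SendAdapter(void* ctx, const char* logstore, const char* data, std::size_t len) {
    assert(ctx != nullptr);  // a missing context is a programming error
    if (logstore == nullptr || data == nullptr) {
        return -1;
    }
    auto* sender = static_cast<CppSender*>(ctx);
    return sender->Send(logstore, std::string(data, len)) ? 0 : -1;
}
```

Neither side needs to change its own interface: the caller keeps using a C-compatible function pointer, and the sender keeps its C++ signature.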
The facade pattern is designed to provide a simple interface for program libraries, frameworks, or other complex classes. Facade classes typically shield some complex interactions of subsystems and provide a simple interface, allowing clients to focus on the functionality they really care about.
In the scenario of collecting Kubernetes logs to SLS, iLogtail automatically completes log collection configurations by supporting environment variables (aliyun_logs_{key}). This includes creating projects, Logstores, machine groups, log collection configurations, and other SLS-related resources. There are many operations involved, which require considering factors such as configuration details, container filter items, operation sequence, and failures.
In the iLogtail Env collection scenario, only a few core configuration items need to be considered. Therefore, implementing a facade class that encapsulates the required functions and hides code details not only simplifies the current call relationship but also minimizes the impact of future backend API upgrades by modifying the implementation of the facade methods in the program.
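A facade for this scenario might look like the sketch below. `BackendClient` and `EnvConfigFacade` are hypothetical stand-ins, and the backend calls only record what they were asked to do instead of contacting a real service.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a backend client with many fine-grained calls (hypothetical).
class BackendClient {
public:
    void CreateProject(const std::string& project) {
        calls.push_back("project:" + project);
    }
    void CreateLogstore(const std::string& project, const std::string& logstore) {
        calls.push_back("logstore:" + project + "/" + logstore);
    }
    void CreateMachineGroup(const std::string& project, const std::string& group) {
        calls.push_back("group:" + project + "/" + group);
    }
    void CreateConfig(const std::string& project, const std::string& logstore,
                      const std::string& name) {
        calls.push_back("config:" + project + "/" + logstore + "/" + name);
    }
    std::vector<std::string> calls;  // record of operations, for illustration
};

// Facade: one call that performs the required steps in the right order.
class EnvConfigFacade {
public:
    explicit EnvConfigFacade(BackendClient& client) : client_(client) {}

    void EnsureCollection(const std::string& project, const std::string& logstore) {
        assert(!project.empty() && !logstore.empty());
        client_.CreateProject(project);
        client_.CreateLogstore(project, logstore);
        client_.CreateMachineGroup(project, "k8s-group");
        client_.CreateConfig(project, logstore, logstore + "-config");
    }

private:
    BackendClient& client_;
};
```

If the backend API changes, only `EnsureCollection` has to be updated; its callers are unaffected.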
The bridge pattern splits a large class or series of closely related classes into two separate hierarchies of abstraction and implementation that can be used separately during development. The concept is rather obscure. Let's understand it in another way. A class has two (or more) dimensions that change independently. These two (or more) dimensions can be extended independently by combining them.
In iLogtail, when you use flusher_http to send requests to different backend systems, operations such as signing requests and appending auth headers need to be supported. The request signing algorithm may vary with the backend platform. To achieve better extensibility, iLogtail provides an extension mechanism that separates the implementation of the flusher_http plugin from the implementation of specific sending policies. As a result, Authenticator, FlushInterceptor, and RequestInterceptors can each be extended independently.
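The separation can be illustrated with a small C++ sketch. The real flusher_http extension mechanism is written in Go and differs in detail; here `Authenticator`, `BasicAuth`, `ApiKeyAuth`, and `HttpFlusher` are simplified, hypothetical classes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

using Headers = std::map<std::string, std::string>;

// Implementation hierarchy: how a request is authenticated.
class Authenticator {
public:
    virtual ~Authenticator() = default;
    virtual void Sign(Headers& headers) const = 0;
};

class BasicAuth : public Authenticator {
public:
    explicit BasicAuth(std::string token) : token_(std::move(token)) {}
    void Sign(Headers& headers) const override {
        headers["Authorization"] = "Basic " + token_;
    }

private:
    std::string token_;
};

class ApiKeyAuth : public Authenticator {
public:
    explicit ApiKeyAuth(std::string key) : key_(std::move(key)) {}
    void Sign(Headers& headers) const override { headers["X-Api-Key"] = key_; }

private:
    std::string key_;
};

// Abstraction hierarchy: what is sent. It holds an Authenticator by
// composition, so the two dimensions can vary independently.
class HttpFlusher {
public:
    explicit HttpFlusher(std::unique_ptr<Authenticator> auth) : auth_(std::move(auth)) {
        assert(auth_ != nullptr);
    }
    Headers BuildRequestHeaders() const {
        Headers headers;
        headers["Content-Type"] = "application/json";
        auth_->Sign(headers);
        return headers;
    }

private:
    std::unique_ptr<Authenticator> auth_;
};
```

Supporting a new backend then means adding one `Authenticator` subclass, with no change to the flusher.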
The proxy pattern is used to hide the details of a concrete implementation class behind a proxy class. It is commonly used to add logic before and after the actual implementation is called. As the name suggests, the proxy handles all requests from the client and hides the real implementation from it.
In iLogtail, the main objective is to ensure that data is accurately sent to backend services. When sending data to SLS, the fundamental step is to call the SDK to send the packaged data. Although the process seems simple, doing it reliably is hard. Backend services are complex and constantly changing, and there are many sources of uncertainty, such as network instability, exceeded backend quotas, authentication failures, occasional service unavailability, flow control, and process restarts. If each data sender called the SLS SDK directly and independently, a significant amount of duplicate code would be generated, increasing code complexity. To address this, iLogtail introduces the Sender proxy class to make SDK sending more reliable. The data sender only needs to call Sender::Instance()->Send and can then treat the data as sent. The Sender class handles the remaining complex scenarios and ensures that the data is successfully delivered to the backend system.
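A heavily reduced sketch of this idea is shown below: a proxy that adds retries around a raw client while exposing the same interface. `LogSender`, `FlakySdkClient`, and `RetryingSender` are hypothetical names, and the real Sender does far more (back pressure, flow control, persistence across restarts).

```cpp
#include <cassert>
#include <string>

// Subject interface shared by the real client and the proxy.
class LogSender {
public:
    virtual ~LogSender() = default;
    virtual bool Send(const std::string& payload) = 0;
};

// Stand-in for a raw SDK client that fails its first N calls (hypothetical).
class FlakySdkClient : public LogSender {
public:
    explicit FlakySdkClient(int failures) : failures_left_(failures) {}
    bool Send(const std::string&) override {
        ++attempts;
        if (failures_left_ > 0) {
            --failures_left_;
            return false;
        }
        return true;
    }
    int attempts = 0;

private:
    int failures_left_;
};

// Proxy: same interface, adds bounded retries around the real call.
class RetryingSender : public LogSender {
public:
    RetryingSender(LogSender& real, int max_attempts)
        : real_(real), max_attempts_(max_attempts) {
        assert(max_attempts > 0);
    }
    bool Send(const std::string& payload) override {
        for (int i = 0; i < max_attempts_; ++i) {
            if (real_.Send(payload)) {
                return true;
            }
            // A production sender would back off and apply flow control here.
        }
        return false;
    }

private:
    LogSender& real_;
    int max_attempts_;
};
```

Callers depend only on `LogSender`, so the reliability logic lives in one place instead of being repeated at every call site.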
The proxy pattern is used for method enhancement; the bridge pattern implements system decoupling through combination; the facade pattern allows the client to call the required methods without caring about the instantiation process.
In addition, the composite pattern is used to describe data with hierarchical structures. The flyweight pattern is used to cache objects that have been created in particular scenarios to improve performance.
Behavioral patterns are responsible for efficient communication and assignment of responsibilities between objects and focus on the interaction between classes. These patterns divide responsibilities clearly, making our code clearer.
The observer pattern defines a one-to-many dependency relationship between objects, which is similar to the publish/subscribe mechanism. When the state of an observable object changes, all its dependent objects are notified and automatically handle the event. The observer pattern allows for flexible event processing and looser relationships between objects, and facilitates system extension and maintenance.
The file collection scenario is a typical application of the observer pattern. To balance collection efficiency and cross-platform support, iLogtail combines polling and inotify. inotify offers low latency and low resource consumption, while polling ensures that collection still works in runtime environments where inotify alone is not sufficient.
iLogtail uses events to trigger log reading. Polling and inotify, as two independent modules, store the create, modify, and delete events in the polling event queue and inotify event queue respectively, and finally merge them into a unified event queue.
• The polling module consists of two threads: DirFilePolling and ModifyPolling. DirFilePolling periodically traverses folders based on user configurations and adds files that match the log collection configurations to the modify cache. ModifyPolling periodically scans the status of files in the modify cache and compares it with the previous status (Dev, Inode, Modify Time, and Size). If a change is found, it generates a modify event.
• inotify is an event listening method. It listens to the corresponding directories and subdirectories according to user configuration. When the directory that is being listened to changes, the kernel generates corresponding notification events.
Finally, the LogInput module consumes the event queue, and events such as create, modify, and delete are left for the event handler to process for actual log collection.
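The flow described above can be reduced to a single-threaded sketch: producers push file events into one queue, and registered handlers are notified when the queue is drained. `FileEvent` and `EventQueue` are hypothetical names, and threading, which the real modules rely on, is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

enum class EventKind { Create, Modify, Delete };

struct FileEvent {
    EventKind kind;
    std::string path;
};

// Subject: producers (polling, inotify) push events; subscribed handlers
// are notified when the consumer drains the queue.
class EventQueue {
public:
    using Handler = std::function<void(const FileEvent&)>;

    void Subscribe(Handler handler) {
        assert(handler != nullptr);
        handlers_.push_back(std::move(handler));
    }
    void Push(FileEvent event) { queue_.push(std::move(event)); }

    // Delivers every queued event to every subscriber, in arrival order.
    // Returns the number of events delivered.
    std::size_t Drain() {
        std::size_t delivered = 0;
        while (!queue_.empty()) {
            const FileEvent event = std::move(queue_.front());
            queue_.pop();
            for (const auto& handler : handlers_) {
                handler(event);
            }
            ++delivered;
        }
        return delivered;
    }

private:
    std::queue<FileEvent> queue_;
    std::vector<Handler> handlers_;
};
```

Because producers know only the queue and handlers know only the event type, a new event source or a new handler can be added without touching the other side.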
The chain of responsibility pattern allows you to send requests along the chain of processors. Upon receiving a request, each processor can either process the request or pass it on to the next processor in the chain.
The chain of responsibility transforms specific behaviors into independent objects called processors. In a lengthy process, each step can be extracted into a class with a single method that performs the operation, and the request and its data are passed to that method as parameters.
The data processing pipeline in iLogtail is a classic example of the chain of responsibility pattern. The main body of the plugin system currently consists of four parts: input, processor, aggregator, and flusher. The processor serves as the processing layer and can filter input data. For example, it can check whether specific fields meet requirements, or add, delete, and modify fields. Each configuration can define multiple processors, which are arranged serially. That is, the output of one processor is used as the input of the next, and the output of the last processor is passed to the aggregator.
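The serial structure can be sketched as follows. The real processors are Go plugins; `Processor`, `AddFieldProcessor`, `RequireFieldProcessor`, and `ProcessorChain` are hypothetical C++ stand-ins that operate on a simple key/value record.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using LogRecord = std::map<std::string, std::string>;

// One link in the chain. Returning false drops the record.
class Processor {
public:
    virtual ~Processor() = default;
    virtual bool Process(LogRecord& record) = 0;
};

// Adds or overwrites a field.
class AddFieldProcessor : public Processor {
public:
    AddFieldProcessor(std::string key, std::string value)
        : key_(std::move(key)), value_(std::move(value)) {}
    bool Process(LogRecord& record) override {
        record[key_] = value_;
        return true;
    }

private:
    std::string key_;
    std::string value_;
};

// Filters out records that lack a required field.
class RequireFieldProcessor : public Processor {
public:
    explicit RequireFieldProcessor(std::string key) : key_(std::move(key)) {}
    bool Process(LogRecord& record) override { return record.count(key_) > 0; }

private:
    std::string key_;
};

// Runs processors serially: the output of one is the input of the next.
class ProcessorChain {
public:
    void Add(std::unique_ptr<Processor> processor) {
        assert(processor != nullptr);
        chain_.push_back(std::move(processor));
    }
    // Returns false as soon as any processor drops the record.
    bool Run(LogRecord& record) const {
        for (const auto& processor : chain_) {
            if (!processor->Process(record)) {
                return false;
            }
        }
        return true;
    }

private:
    std::vector<std::unique_ptr<Processor>> chain_;
};
```

A record that survives `Run` would then be handed to the aggregator; one that is dropped never reaches the later processors.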
The memento pattern allows you to capture the internal state of an object without exposing the implementation details of the object, and save this state outside the object, so that the object can be restored to the original state later.
The memento pattern has the following parts:
• Originator: Responsible for recording its internal state at the current moment, defining which state belongs to the backup scope, and creating mementos and restoring from them.
• Memento: Responsible for storing the internal state of the originator object and providing it to the originator when needed.
• Caretaker: Responsible for saving and providing the memento. However, it cannot access or modify the contents of the memento.
The most important feature in log collection scenarios is to ensure that logs are not lost. iLogtail uses the checkpoint mechanism to back up the status of collected files to local disks in a timely manner. This ensures data reliability in extreme scenarios.
Two typical application scenarios:
• Collection configuration update/process upgrade
During configuration updates or process upgrades, you must interrupt the collection and reinitialize the collection context. iLogtail must ensure that logs are not lost even if logs are rotated.
Solution: To ensure that log data is not lost during the configuration update or upgrade process, iLogtail saves the current collection status to the local checkpoint file before the configuration is reloaded or the process exits. After the new configuration application or process is started, iLogtail loads the last saved checkpoint and uses the checkpoint to restore the previous collection status.
• Exceptions such as process crashes and downtime
When a process crashes or fails, iLogtail must provide a fault tolerance mechanism to avoid data loss and minimize repeated collection.
Solution: When a process crashes or the machine goes down, there is no opportunity to record a checkpoint beforehand. Therefore, iLogtail also periodically dumps the collection progress to the local disk. In addition to restoring the status of normal log files, it also searches for rotated logs to minimize the risk of log loss.
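The three roles map onto a checkpoint mechanism roughly as in the sketch below. It is a simplified assumption, not the iLogtail checkpoint format: `Checkpoint` (memento) holds only an inode and an offset, `LogFileReader` is the originator, and any code that stores the `Checkpoint` acts as the caretaker.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Memento: an opaque snapshot of reader progress (fields are hypothetical).
class Checkpoint {
public:
    // Lets the caretaker persist the snapshot; the format is illustrative only.
    std::string Serialize() const {
        return std::to_string(inode_) + ":" + std::to_string(offset_);
    }

private:
    friend class LogFileReader;  // only the originator can create or read it
    Checkpoint(std::uint64_t inode, std::uint64_t offset)
        : inode_(inode), offset_(offset) {}
    std::uint64_t inode_;
    std::uint64_t offset_;
};

// Originator: creates checkpoints and restores its state from them.
class LogFileReader {
public:
    explicit LogFileReader(std::uint64_t inode) : inode_(inode) {}

    void Advance(std::uint64_t bytes) { offset_ += bytes; }
    std::uint64_t Offset() const { return offset_; }

    Checkpoint Save() const { return Checkpoint(inode_, offset_); }
    void Restore(const Checkpoint& checkpoint) {
        // A checkpoint taken for another file must not be applied here.
        assert(checkpoint.inode_ == inode_);
        offset_ = checkpoint.offset_;
    }

private:
    std::uint64_t inode_;
    std::uint64_t offset_ = 0;
};
```

After a restart, a new reader for the same file calls `Restore` with the last saved checkpoint and resumes from the recorded offset, so data read after that checkpoint is collected again rather than lost.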
The iterator pattern provides a way to access the elements of an object without exposing the internal details of the object.
The Golang plug-in uses LevelDB to back up some context resources and restore data based on the iterator pattern.
// Iterator iterates over a DB's key/value pairs in key order.
type Iterator interface {
    CommonIterator

    // Key returns the key of the current key/value pair, or nil if done.
    // The caller should not modify the contents of the returned slice, and
    // its contents may change on the next call to any 'seeks method'.
    Key() []byte

    // Value returns the value of the current key/value pair, or nil if done.
    // The caller should not modify the contents of the returned slice, and
    // its contents may change on the next call to any 'seeks method'.
    Value() []byte
}
Behavioral patterns focus on the ways and patterns of communication and interaction between objects.
• Observer Pattern: Defines a one-to-many dependency relationship. When the state of an object changes, all its dependents will be notified and automatically updated.
• Chain of Responsibility Pattern: Decouples the sender and receiver of a request so that multiple objects have the opportunity to process the request until one of the objects successfully processes it.
• Memento Pattern: Allows saving and restoring the previous state of an object without exposing its implementation details.
• Iterator Pattern: Provides a unified way to access each element in an aggregate object without exposing its internal structure.
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.