Lindorm Edge 101

This article explains the ins and outs of Lindorm Edge, a high-performance online TSDB service that can be embedded into edge devices.

Lindorm Edge is a high-performance, low-cost, stable, and reliable online time series database service that can be embedded into edge devices.

1. Overview

Lindorm Edge provides efficient read/write, high-compression-ratio storage, time series data interpolation, and aggregate computing services. It is widely used in industry scenarios such as Internet of Things (IoT) equipment monitoring systems, enterprise energy management systems (EMS), production safety monitoring systems, and power detection systems.

Lindorm Edge can write millions of time series data points in seconds. It provides functions such as high-compression-ratio, low-cost storage, pre-downsampling, multidimensional aggregate computation, and visualization of query results. It addresses problems such as high storage cost, poor write performance, and low query and analysis efficiency, which are caused by the huge number of equipment collection points and the high data collection frequency.

2. Product Advantage

Lindorm Edge is a high-performance, low-cost, easy-to-use database with fast operations and maintenance (O&M). It provides:

  • Excellent Performance

It provides efficient read and write capabilities. Its read/write efficiency is several times higher than that of open-source OpenTSDB.

  • Low Storage Costs

It compresses raw data effectively based on an efficient compression algorithm, saving up to 90% of the storage space.

  • Easy to Use

The console provides a wide range of data management and O&M functions. The operation is simple and convenient, allowing you to complete daily data management and O&M easily.

  • Professional Operation and Maintenance Support

Lindorm Edge provides professional monitoring and alarm systems with quick O&M capabilities. The R&D Team features world-class database experts that provide professional technical support.

3. Product Ecosystem

The core of Lindorm Edge is the integration of an inverted index, the TSM data compression format, rich dimensional aggregate queries, and the TICK (Telegraf, Lindorm Edge, Chronograf, and Kapacitor) ecosystem, which supports PromQL queries. The inverted index accelerates multi-dimensional queries. The TSM data file provides compact compression, which reduces storage costs and speeds up queries. Rich dimensional aggregate queries help users analyze data more conveniently. The Enterprise Edition of Lindorm Edge is composed of three nodes that guarantee service reliability and data consistency through the Raft consensus protocol. All three nodes can accept write and query requests. If one node fails, the remaining two nodes can still provide service as normal.

4. Features

Lindorm Edge provides features such as efficient read/write of time series data, data management, and high-compression storage to ensure the stability of your business.

4.1 Efficient Reading and Writing of Time Series Data

Lindorm Edge reads and writes time series data at high speed. The response time is less than five seconds when reading millions of data points, and it can write tens of millions of data points per second.

  • Data Write

Lindorm Edge supports data writing through the REST API.

  • Data Query

Lindorm Edge supports two methods of querying data: through the REST API or from the client.

4.2 Data Management

  • Data Retention Policy Settings

You can set the data retention policy through the console or API. After the policy is set, data is divided into shards according to the shard duration, and expired shards are deleted automatically.

  • Data Cleaning

You can clean data by Measurement on the console or perform more flexible data cleaning through APIs.
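
The retention behavior described above can be sketched as follows. This is a minimal illustration, not the Lindorm Edge implementation: the function names `shard_start` and `expired_shards` are hypothetical, and it assumes that shards are aligned to the Unix epoch and that a shard is dropped only when its whole time range is older than the retention window.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def shard_start(ts: datetime, shard_duration: timedelta) -> datetime:
    """Floor a timestamp to the start of the shard that contains it."""
    whole_shards = (ts - EPOCH) // shard_duration  # timedelta // timedelta -> int
    return EPOCH + whole_shards * shard_duration


def expired_shards(shard_starts, now, retention, shard_duration):
    """Return the shards whose entire time range is older than the retention window."""
    cutoff = now - retention
    return [s for s in shard_starts if s + shard_duration <= cutoff]


# Hypothetical usage: one-day shards with a seven-day retention policy.
day = timedelta(days=1)
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
shards = [datetime(2024, 1, d, tzinfo=timezone.utc) for d in (1, 5, 9)]
to_delete = expired_shards(shards, now, retention=7 * day, shard_duration=day)
```

Deleting whole shards instead of individual points keeps expiry cheap, because removing a shard is a file-level operation.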

4.3 Efficient Compressed Storage

Lindorm Edge uses efficient data compression technology to reduce the average storage space used by every data point to 1-2 bytes. This can reduce storage space usage by 90% and speed up data writing simultaneously.
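
As a rough sanity check of the 90% figure, the arithmetic can be sketched as follows. The 16-byte raw size is an assumption (an 8-byte timestamp plus an 8-byte value per point); the article does not state the raw size it compares against.

```python
RAW_BYTES_PER_POINT = 16  # assumption: 8-byte timestamp + 8-byte value


def savings(compressed_bytes: float, raw_bytes: float = RAW_BYTES_PER_POINT) -> float:
    """Fraction of storage saved relative to the uncompressed size."""
    return 1 - compressed_bytes / raw_bytes


for size in (1, 2):
    print(f"{size} byte(s) per point saves {savings(size):.1%} of the raw size")
```

Under this assumption, 1 to 2 bytes per point corresponds to savings between 87.5% and 93.75%, which is consistent with the stated figure of about 90%.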

4.4 Time Series Data Calculation Ability

Lindorm Edge provides rich time series data calculation functions, such as Aggregation, Selector, Transformation, and Predictor. It supports down-sampling and multi-dimensional aggregation of data and can meet various complex business data query scenarios.
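
Down-sampling can be illustrated with a short sketch. It is not the Lindorm Edge query engine; the function name `downsample` and the sample data are hypothetical, and timestamps are assumed to be integer seconds.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, interval_s, agg=mean):
    """Group (timestamp_s, value) pairs into fixed windows and aggregate each window."""
    buckets = defaultdict(list)
    for ts, value in points:
        window_start = ts - ts % interval_s  # floor to the start of the window
        buckets[window_start].append(value)
    return [(start, agg(values)) for start, values in sorted(buckets.items())]


# Hypothetical samples collected every 10 seconds, reduced to one value per minute.
points = [(0, 1.0), (10, 3.0), (60, 5.0), (70, 7.0)]
per_minute_mean = downsample(points, 60)
per_minute_max = downsample(points, 60, agg=max)
```

Passing the aggregation as a parameter mirrors how one query can apply different functions (mean, max, and so on) to the same windows.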

4.5 Monitoring and O&M

Lindorm TSDB provides a real-time O&M system that displays the instance status, performance metrics, and storage usage, which helps you identify resource bottlenecks through alarm settings.

4.6 Data and Instance Security

Lindorm Edge provides the following solutions to ensure data and instance security:

  • Instance access via VPC ensures the security of instance access.
  • The IP allowlist configuration lets users set acceptable IPs to secure the instance and data access. If the IP of a client is within the VPC but not in the allowlist, its access requests will be denied.
  • Lindorm Edge Enterprise Edition adopts a three-copy strategy to guarantee data availability.

5. Time Series Data Model

5.1 IoTDB

IoTDB is a database designed for IoT time series data, providing data collection, storage, and analysis functions. IoTDB provides an integrated solution on the cloud with high-performance data reading and writing and multi-dimensional query capabilities. It also has an efficient directory organization structure dedicated to IoT scenarios and is compatible with big data systems, such as Apache Hadoop, Spark, and Flink. At the edge, IoTDB provides lightweight TsFile management capabilities, writing data generated by edge devices to the local TsFile and providing specific basic query capabilities. Users can sync the TsFile data to the cloud simultaneously.

A tree structure is implemented through the organization of IoTDB metadata. An instance contains multiple Storage Groups (similar to the concepts of Namespace and Database), and a Storage Group contains multiple Devices. Each Device contains multiple Measurements, and the time series data corresponding to each Measurement is eventually stored in TsFile Chunk. In addition, to manage data duration, the data of each Storage Group will be segmented and stored in different directories based on a time range. It uses one week as a unit by default.
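
The hierarchy and the time-based segmentation can be sketched with plain Python containers. This illustrates the data model only and does not use the IoTDB API; the names `register` and `time_partition` and the sample paths are hypothetical, and partitions are assumed to be counted from the Unix epoch in milliseconds.

```python
WEEK_MS = 7 * 24 * 60 * 60 * 1000  # default partition unit described above: one week


def register(tree, storage_group, device, measurement):
    """Record a measurement under Storage Group -> Device -> Measurement."""
    tree.setdefault(storage_group, {}).setdefault(device, set()).add(measurement)


def time_partition(timestamp_ms, interval_ms=WEEK_MS):
    """Index of the time-range directory that a data point falls into."""
    return timestamp_ms // interval_ms


tree = {}
register(tree, "root.factory", "d1", "temperature")
register(tree, "root.factory", "d1", "pressure")
```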

5.2 TimescaleDB

TimescaleDB is a time series database built as an extension of PostgreSQL. Data is automatically partitioned into chunks by time and space.

The data in TimescaleDB must be modeled as two-dimensional tables, which requires users to design and organize their business data this way.

The TimescaleDB official documentation describes two paradigms for designing tables for time series data:

  • Narrow Table
  • Wide Table

For Narrow Table, the metrics are recorded separately. One row represents one record, and each row contains only one metric value and its timestamp.

For Wide Table, the data is stored according to the timestamp. In this manner, you can store multiple metrics associated with a device in the same row. In general, the attributes (metadata) of a device can even be stored in other tables as auxiliary data for a single record. These kinds of data should be queried with the JOIN statement when needed.
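
The difference between the two layouts can be shown by pivoting narrow rows into wide rows. This is a sketch in plain Python rather than SQL; the column names (`time`, `device_id`) and the sample readings are hypothetical.

```python
from collections import defaultdict

# Narrow layout: one row per (timestamp, device, metric, value).
narrow_rows = [
    ("2024-01-01T00:00:00Z", "dev-1", "temperature", 21.5),
    ("2024-01-01T00:00:00Z", "dev-1", "humidity", 40.0),
    ("2024-01-01T00:01:00Z", "dev-1", "temperature", 21.7),
]


def narrow_to_wide(rows):
    """Pivot narrow rows so that each (timestamp, device) pair becomes one wide row."""
    wide = defaultdict(dict)
    for ts, device, metric, value in rows:
        wide[(ts, device)][metric] = value
    return [
        {"time": ts, "device_id": device, **metrics}
        for (ts, device), metrics in sorted(wide.items())
    ]


wide_rows = narrow_to_wide(narrow_rows)
```

In the wide result, a metric that was not reported at a given timestamp is simply absent from that row, which corresponds to a NULL column in a relational table.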

6. Writing and Out-of-Order Processing

  • The LSM tree structure is adopted to handle out-of-order writes. When a write request is received, the data is first recorded in the write-ahead log (WAL) through the write API for failure recovery.
  • After the WAL is written, the data is written into the in-memory cache. Unsorted writes are supported to ensure write performance.
  • When the cache exceeds a certain size, it is compressed in memory and written to disk as an immutable TSM file.
  • The process that dumps the immutable TSM data in memory to the TSM layer on disk is also called compaction. Note: The TSM files at the L1 layer are not merged, so the key ranges of multiple TSM files can overlap. At layers above L1, they do not.
  • When the size or number of TSM files on a layer exceeds a threshold, the files are also merged periodically, which is also called compaction. In this stage, data that has been marked for deletion is physically removed, and multiple versions of the same data are merged to avoid wasting space.
  • When a read request is received, the file list snapshot in the manifest is checked first. The memory is then queried, and results are returned on a hit.
  • If no results are found in memory, all levels are queried one by one until the final results are obtained.
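
The steps above can be sketched as a toy in-memory model. It is a simplified illustration under stated assumptions, not the Lindorm Edge storage engine: the class `TinyLSM` and its methods are hypothetical, Python lists stand in for the WAL and the TSM files, there are only two layers, and deletion markers and the manifest are omitted.

```python
class TinyLSM:
    def __init__(self, flush_threshold=4):
        self.wal = []            # append-only log, stands in for the on-disk WAL
        self.cache = {}          # in-memory cache, accepts points in any time order
        self.levels = [[], []]   # levels[0]: flushed files (may overlap); levels[1]: merged
        self.flush_threshold = flush_threshold

    def write(self, series, ts, value):
        self.wal.append((series, ts, value))  # 1. log first, for failure recovery
        self.cache[(series, ts)] = value      # 2. then buffer in memory, unsorted
        if len(self.cache) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. sort the buffered points and freeze them as an immutable "file"
        self.levels[0].append(tuple(sorted(self.cache.items())))
        self.cache.clear()
        self.wal.clear()  # flushed data no longer needs the log

    def compact(self):
        # 4. merge files into one sorted file; iterating oldest first lets newer values win
        merged = {}
        for file in self.levels[1] + self.levels[0]:
            merged.update(file)
        self.levels[1] = [tuple(sorted(merged.items()))]
        self.levels[0] = []

    def read(self, series, ts):
        key = (series, ts)
        if key in self.cache:             # 5. memory first
            return self.cache[key]
        for level in self.levels:         # 6. then each level, newest file first
            for file in reversed(level):
                for k, v in file:
                    if k == key:
                        return v
        return None


db = TinyLSM(flush_threshold=2)
db.write("cpu", 20, 0.5)
db.write("cpu", 10, 0.4)   # arrives out of order, triggers a flush
db.write("cpu", 10, 0.9)   # newer version of the same point
db.write("cpu", 30, 0.1)   # triggers a second flush
before = db.read("cpu", 10)
db.compact()
after = db.read("cpu", 10)
```

Because every flushed file is sorted internally, out-of-order arrivals cost nothing at write time. The ordering work is deferred to flush and compaction.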

7. Encoding and Compression - Adaptive Intelligent Algorithm

Lindorm Edge uses adaptive intelligent algorithms that apply multiple encoding and compression algorithms to five types of data: timestamp, floating point, integer, Boolean, and string. This section introduces the XOR encoding algorithm.

The XOR encoding algorithm works roughly as follows:

  1. The first value is stored without compression.
  2. Each subsequent value is XORed with the previous value.
    If the XOR result is 0, that is, the two values are the same, a single '0' bit is stored.
    If the XOR result is non-zero, the first bit is stored as '1', and the numbers of leading and trailing zeros in the XOR result (Leading Zeros and Trailing Zeros) are calculated.
    2.1) If Leading Zeros and Trailing Zeros are the same as those of the previous XOR value, the second bit is stored as '0', followed by only the effective XOR bits that remain after the leading and trailing zeros are removed.
    2.2) If Leading Zeros and Trailing Zeros differ from those of the previous XOR value, the second bit is stored as '1', followed by 5 bits that describe the Leading Zeros count, 6 bits that describe the length of the effective XOR value, and finally the effective XOR bits. (In this case, at least 13 bits of overhead are generated.)
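
The bit cost of each case can be checked with a small sketch that counts how many bits each value would occupy, without producing the actual bit stream. The function names are hypothetical. The sketch assumes 64-bit IEEE 754 floats, that the leading-zero count is capped at 31 because it must fit in 5 bits, and that a zero XOR leaves the remembered leading/trailing window unchanged.

```python
import struct


def float_bits(x: float) -> int:
    """Reinterpret a 64-bit float as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_encoded_sizes(values):
    """Return the number of bits each value occupies under the scheme described above."""
    sizes = [64]                      # step 1: first value stored uncompressed
    prev = float_bits(values[0])
    prev_window = None                # (leading, trailing) of the last non-zero XOR
    for v in values[1:]:
        cur = float_bits(v)
        x = prev ^ cur
        if x == 0:
            sizes.append(1)           # identical value: a single '0' bit
        else:
            leading = min(64 - x.bit_length(), 31)   # capped to fit in 5 bits
            trailing = (x & -x).bit_length() - 1     # index of the lowest set bit
            effective = 64 - leading - trailing
            if (leading, trailing) == prev_window:
                sizes.append(2 + effective)          # case 2.1: control bits '10'
            else:
                sizes.append(2 + 5 + 6 + effective)  # case 2.2: control bits '11'
            prev_window = (leading, trailing)
        prev = cur
    return sizes


sizes = xor_encoded_sizes([12.0, 12.0, 24.0, 12.0])
```

In this example, the repeated 12.0 costs 1 bit, the jump to 24.0 costs 13 control bits plus 1 effective bit, and the return to 12.0 reuses the same window and costs only 3 bits.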

XOR compression encoding of 1,600,000 sampled Point values in Lindorm gives the following results:

  • The proportion of values that occupy only 1 bit is as high as 59.06%, which means that more than half of the Point values have not changed from the previous value.
  • About 30% of values occupy an average of 26.6 bits, which is the case mentioned in 2.1.
  • The remaining 12.64% of values occupy an average of 39.6 bits, which is the case mentioned in 2.2.

In addition, the XOR encoding algorithm is particularly effective for integer Point values.
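
The three proportions above can be combined into an approximate average cost per value. This is a back-of-the-envelope check only. It counts value bits and ignores timestamps, and the quoted shares add up to slightly more than 100%, presumably because the 30% figure is rounded.

```python
# (share of values, average bits per value), as quoted in the list above
cases = [
    (0.5906, 1.0),    # unchanged values
    (0.30, 26.6),     # case 2.1
    (0.1264, 39.6),   # case 2.2
]

avg_bits = sum(share * bits for share, bits in cases)
avg_bytes = avg_bits / 8
print(f"about {avg_bits:.1f} bits, or {avg_bytes:.2f} bytes, per value")
```

The result is roughly 13.6 bits, or about 1.7 bytes, per value, which falls inside the range of 1 to 2 bytes per data point cited in section 4.3.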

8. Cloud Edge Integration

Lindorm Edge synchronizes data from the edge to the cloud in batches through the file upload API, transmitting the encoded TSM files directly. This approach has a high compression ratio, which saves bandwidth, but has high latency (at the minute or hour level). The internal data on the edge is synchronized in real time by the Replication Manager through the WAL. File hard links are used, so the storage cost does not increase. By watching the WAL, parsing logs, and uploading via HTTP, this approach achieves delays of only seconds (or even milliseconds), but it requires high bandwidth and consumes more resources.

9. Typical Scenarios

Lindorm Edge is widely used in industries and scenarios, such as Internet of Things (IoT) equipment monitoring systems, enterprise energy management systems (EMS), production safety monitoring systems, and power detection systems.

  • IoT Device Monitoring and Analysis

IoT devices continuously generate massive amounts of device status data and business message data. This data is helpful for device monitoring, business analysis and prediction, and fault diagnosis. The device sends raw data to the IoT suite through the MQTT protocol. The IoT suite forwards the data to the message service system, and the stream computing system processes it in real time before writing it to Lindorm Edge. Alternatively, the IoT suite writes raw data directly to Lindorm Edge. The frontend monitoring system and big data processing system then use the data query and analysis capabilities of Lindorm Edge for business monitoring and real-time display of analysis results.

  • Power Chemical Industry and Industrial Manufacturing Monitoring Analysis

Traditional power chemical and industrial manufacturing equipment requires real-time monitoring systems for equipment status detection, fault discovery, and business trend analysis.

The equipment sends its status and production business data to the industrial equipment gateway through the industrial interface protocol, and the gateway sends the data to the IoT suite through the MQTT protocol. The data is then transmitted to the message service system on the cloud and processed by the stream computing system before being written to Lindorm Edge, which completes the storage and analysis of time series data.

  • System O&M and Real-Time Business Monitoring

By monitoring large-scale application clusters and equipment in the data center, you can understand equipment status, resource utilization, and business trends in real time and realize digital operations, automated development, and O&M.

Raw data is collected through logs and other methods and processed in real time. The results are stored in Lindorm Edge for monitoring, analysis, and display.
