TSDBs

Updated at: 2024-12-06 07:37

Time series databases (TSDBs) are designed to store and process time series data efficiently. Time series data is a sequence of data points arranged in chronological order. Each data point usually contains a timestamp and one or more values (metrics). Time series data is widely used in fields such as the Internet of Things (IoT), system and server monitoring, financial transactions, industrial control, sensor networks, and meteorological records.

Background information

Time series databases were introduced for the following reasons:

  • The emergence of the Internet of Things (IoT): With the increasing popularity of IoT technology, a large number of devices and sensors are deployed around the world. They continuously generate time series data, such as temperature, humidity, and device status. Such data is time-stamped and arrives in very large volumes, which makes it difficult for traditional relational databases to store and process it efficiently.

  • Growing demand for monitoring and log management: Monitoring of systems and applications is becoming increasingly important in fields such as IT, industrial control, and energy management. These applications need to collect and analyze time series data to discover anomalies, performance bottlenecks, and trends that assist decision-making. Traditional databases are poorly suited to highly concurrent writes, data compression, and efficient querying of time series data.

  • Big data analysis and forecasting needs: With the development of data analysis technology, enterprises and research institutions focus more on analyzing historical time series data for purposes such as predictive maintenance, market trend prediction, and climate model prediction. This requires the database not only to store large-scale time series data efficiently, but also to support complex analysis and queries on that data.

  • Resource optimization and cost control: In many scenarios, the cost of storing and processing time series data is an important consideration. Time series databases use technologies such as data compression, efficient indexing, and storage optimization to significantly reduce storage and computing resource costs while ensuring query efficiency.

Time series databases were therefore introduced to store time series data. They are optimized for the characteristics of such data, with features such as high write throughput, data compression, and time range query optimization, to better meet the requirements of these scenarios.
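The time range query optimization mentioned above depends on data being ordered by time. The following Python sketch is a conceptual illustration only: the function name `range_query` and the sample data are hypothetical and are not part of GanosBase TSDB. Assuming an in-memory series whose timestamps are already sorted, it shows how a range lookup can locate its boundaries by binary search instead of scanning every point.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def range_query(timestamps, values, start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end.

    Assumes `timestamps` is sorted in ascending order, which is the
    natural layout of an append-mostly time series.
    """
    lo = bisect_left(timestamps, start)   # first index with timestamp >= start
    hi = bisect_right(timestamps, end)    # first index with timestamp > end
    return list(zip(timestamps[lo:hi], values[lo:hi]))


# Hypothetical sample: one temperature reading per minute.
base = datetime(2024, 1, 1)
timestamps = [base + timedelta(minutes=i) for i in range(10)]
values = [20.0 + i for i in range(10)]

window = range_query(
    timestamps,
    values,
    base + timedelta(minutes=3),
    base + timedelta(minutes=5),
)
```

Here `window` holds only the readings taken between minute 3 and minute 5, inclusive. A real database applies the same idea at a coarser level, for example by skipping whole time partitions that fall outside the requested range.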

Features

GanosBase TSDB is a time series database implemented on PolarDB for PostgreSQL. GanosBase TSDB is fully compatible with open-source TimescaleDB (Apache 2.0). It also provides advanced features such as continuous aggregates, time series compression, and statistical analysis. Its core features include:

  • High-performance reads and writes: GanosBase TSDB can process tens of thousands of data point writes per second and provides fast historical data retrieval capabilities to meet real-time monitoring and analysis requirements.

  • Continuous aggregates: GanosBase TSDB supports custom aggregate views. In these views, raw time series data is aggregated (for example, as a sum, average, maximum, or minimum) at a specified time interval (such as every minute, every hour, or every day), and the results are stored in a separate materialized view. This process is automated. After the configuration is complete, GanosBase TSDB refreshes these aggregate views in the background at a regular interval or based on data changes, so that the aggregated data stays up to date.

  • Cross-modal data processing: GanosBase TSDB supports integrated storage and retrieval of data in different modalities, such as time series and spatio-temporal data. Data can be partitioned based on time and space. During queries, partitions that cannot match the query conditions are automatically pruned to accelerate retrieval.

  • Low-cost storage: GanosBase TSDB uses optimized storage structures and compression algorithms to reduce storage space consumption. It supports deduplication and high-compression storage for similar or duplicate data, reducing storage space by more than 70%. GanosBase TSDB can also work with Object Storage Service (OSS) to archive cold data, which greatly reduces storage costs for large volumes of time series data.

  • Data retention policies: GanosBase TSDB supports flexible data retention policies and can automatically delete expired data based on business requirements. This helps control data storage costs and keeps databases running efficiently.

  • Scalability and high availability: For large applications, GanosBase TSDB relies on PolarDB to provide horizontal scaling capabilities and failover mechanisms to ensure system stability and reliability.
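The continuous aggregates feature listed above can be illustrated with a small Python sketch. It is a conceptual model only and does not use the GanosBase TSDB interface: the helper names `time_bucket` and `aggregate` and the sample readings are assumptions made for this example. The sketch groups raw points into fixed-width time buckets and computes summary values for each bucket, which is the kind of result an aggregate view stores.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def time_bucket(ts, width):
    """Floor a timestamp to the start of its fixed-width bucket."""
    offset = (ts - EPOCH) // width  # number of whole buckets since the epoch
    return EPOCH + offset * width


def aggregate(points, width):
    """Group (timestamp, value) pairs by bucket and summarize each bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[time_bucket(ts, width)].append(value)
    return {
        start: {
            "min": min(vs),
            "max": max(vs),
            "avg": sum(vs) / len(vs),
            "count": len(vs),
        }
        for start, vs in sorted(buckets.items())
    }


# Hypothetical raw readings.
raw = [
    (datetime(2024, 1, 1, 10, 0, 10), 1.0),
    (datetime(2024, 1, 1, 10, 0, 50), 3.0),
    (datetime(2024, 1, 1, 10, 1, 5), 5.0),
]
per_minute = aggregate(raw, timedelta(minutes=1))
```

This sketch recomputes every bucket in one pass. A continuous aggregate in a database instead refreshes the stored results in the background, so queries read the precomputed buckets rather than the raw data.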

In summary, GanosBase TSDB is designed and optimized specifically for time series data, which makes it well suited to large-scale monitoring, analysis, and prediction tasks.
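Two of the cost-related features listed above, compact storage of duplicate data and data retention policies, can be sketched conceptually in Python. The functions below are hypothetical illustrations. Run-length encoding is used here only as a simple stand-in for compression of repeated values and is not claimed to be the algorithm that GanosBase TSDB uses.

```python
from datetime import datetime, timedelta
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive duplicate values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


def apply_retention(points, now, keep_for):
    """Keep only points whose timestamp is no older than now - keep_for."""
    cutoff = now - keep_for
    return [(ts, value) for ts, value in points if ts >= cutoff]


# Hypothetical device status stream with long runs of identical values.
status = ["OK", "OK", "OK", "FAULT", "OK", "OK"]
encoded = run_length_encode(status)

# Hypothetical retention window of 30 days.
now = datetime(2024, 1, 31)
points = [(now - timedelta(days=d), "OK") for d in (40, 20, 10, 1)]
recent = apply_retention(points, now, timedelta(days=30))
```

In this example the six status values are stored as three pairs, and the point that is 40 days old is dropped by the 30-day retention window. A database would typically apply retention by dropping whole expired time partitions rather than filtering individual rows.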

Scenarios

GanosBase TSDB is mainly used in the following scenarios:

  • IoT: A large number of devices (such as smart sensors) continuously generate monitoring data such as temperature, humidity, and pressure. Such data has time series characteristics and needs to be collected, stored, and analyzed in real time to enable remote monitoring and early fault warning.

  • Monitoring systems: IT infrastructure monitoring (such as server performance and network traffic) and application performance management rely on time series data. These systems need to record changes in system metrics over time to quickly discover problems and optimize performance.

  • Financial services: Stock prices, trading volumes, and exchange rates in financial markets are typical time series data. Time series databases can be used for high-frequency trading analysis, market trend prediction, and risk management.

  • Energy and utilities: The operational data and energy consumption data of smart grids and hydropower stations are stored and analyzed in time series databases to help improve energy allocation efficiency and forecasting.

  • Industrial automation: Equipment status monitoring and production process control in the manufacturing industry generate a large amount of time series data. Time series databases support equipment health management and capacity optimization.
