AnalyticDB for PostgreSQL supports two storage models: row store and column store. When you create a table, you can specify the row store model to create a row-oriented table or specify the column store model to create a column-oriented table.
Row-oriented table
By default, row-oriented tables are created in AnalyticDB for PostgreSQL. Row-oriented tables use the same storage architecture as the heap tables of PostgreSQL. Row-oriented tables provide better performance in online transaction processing (OLTP) scenarios where one or several rows of data are frequently inserted and then modified or deleted. If you need small tables, such as dimension tables, we recommend that you use row-oriented tables. When B-tree indexes are used, row-oriented tables can provide better point lookup query performance.
Execute the following statement to create a row-oriented heap table named foo:
CREATE TABLE foo (a int, b text)
DISTRIBUTED BY (a);
Column-oriented table
Data within a column-oriented table is stored by column. When you access data, only relevant columns are read. Column-oriented tables can provide efficient I/O operations in data warehousing scenarios such as data queries and aggregations of a small number of columns. However, column-oriented tables are less efficient in scenarios where data is frequently inserted or updated. We recommend that you use a batch loading method such as COPY to insert data into column-oriented tables. Column-oriented tables can provide a data compression ratio up to five times higher than that provided by row-oriented tables.
A column-oriented table must be an append-optimized (AO) table. When you create a column-oriented table, you must set appendonly to true.
Execute the following statement to create a column-oriented table named bar:
CREATE TABLE bar (a int, b text)
WITH (appendonly=true, orientation=column)
DISTRIBUTED BY (a);
Select the storage model for a table
The selection of storage model is determined based on data workloads. When you select a storage model, take note of the following items:
- For most common-purpose or mixed workloads, row-oriented tables can provide better flexibility and performance.
- If you need to frequently insert, modify, or delete one or several rows of data, we recommend that you select row-oriented tables.
- If you need to import large amounts of data that is seldom modified or deleted, we recommend that you select row-oriented tables.
- If you need to perform projection queries or conditional statements on all or most columns or if the row size is small, we recommend that you select row-oriented tables. If a subset of many columns are frequently queried, we recommend that you select column-oriented tables.
- Column-oriented tables use the same data type in each column, which makes it easier to compress similar and adjacent data. Therefore, column-oriented tables occupy less storage than row-oriented tables. However, easier compression brings more difficult random queries.
Modify the storage model of a table
The storage model can be specified only when tables are created. To change the storage model of a table, you must create another table by using your desired storage model and import data from the original table to the new table. Then, you can delete the original table and rename the new table the name of the original table. You must also assign permissions of the original table to the new table. This process is not atomic. To ensure data integrity, you must set the original table to read-only during this process.
Execute the following statements to change the storage model of the foo table from row store to column store:
CREATE TABLE foo_tmp (LIKE foo) WITH (appendonly=true, orientation=column);
INSERT INTO foo_tmp SELECT * FROM sales;
DROP TABLE foo;
ALTER TABLE foo_tmp RENAME TO foo;
GRANT ALL PRIVILEGES ON foo TO user1;
GRANT SELECT ON foo TO user2;