Data Lake Formation (DLF) allows you to view information such as storage usage, metadata objects, storage trends, storage class distribution, storage format distribution, and file distribution. This information helps you quickly understand how storage resources are used, identify issues, and optimize accordingly.
Prerequisites
Object Storage Service (OSS) is activated.
Location hosting is complete in DLF.
Enable storage overview
Log on to the DLF console, choose Lake Management > Storage Overview in the left-side navigation pane, and then click Enable to enable the Storage Overview feature.
If you enable this feature, statistical files are written to the OSS buckets of your metadatabases. You are charged for the storage of these files.
No statistical data is generated on the day you enable storage overview. You can view statistical data on the next day.
Feature description
Metadata analysis
Summary of resources
Total storage space used and monthly and daily changes: the total OSS storage space used for storing tables that are displayed on the Metadata page.
Total number of tables and monthly and daily changes: the total number of tables that are displayed on the Metadata page.
Total number of databases and monthly and daily changes: the total number of databases that are displayed on the Metadata page.
Monthly and daily API visits: the number of API visits in the current calendar month.
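The daily and monthly changes listed above are differences between two snapshots of the same total. The following is a minimal sketch of that arithmetic, not DLF's implementation. The `samples` mapping and the function name are hypothetical, and the 30-day window for the monthly change is an assumption, because this page does not define the comparison window.

```python
from datetime import date, timedelta


def daily_and_monthly_change(samples, today, month_days=30):
    """Return (daily_change, monthly_change) for a daily series of totals.

    samples: dict mapping a date to a total (bytes, tables, or databases).
    month_days: assumed length of the monthly comparison window.
    A change is None when the snapshot it needs is missing.
    """
    current = samples.get(today)
    if current is None:
        # No snapshot for today, for example on the day the feature is enabled.
        return None, None

    previous_day = samples.get(today - timedelta(days=1))
    previous_month = samples.get(today - timedelta(days=month_days))

    daily = None if previous_day is None else current - previous_day
    monthly = None if previous_month is None else current - previous_month
    return daily, monthly


# Hypothetical snapshots of total storage in bytes.
snapshots = {
    date(2024, 3, 1): 100,
    date(2024, 3, 30): 140,
    date(2024, 3, 31): 150,
}
daily, monthly = daily_and_monthly_change(snapshots, date(2024, 3, 31))
```

Returning None for a missing snapshot mirrors the behavior described earlier, where no statistical data exists on the day the feature is enabled.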
Trend change
This section displays trend charts for storage capacity, the number of tables, the number of databases, and API visits.
You can select a time period for the query.
Rankings of table and database storage
This section displays the rankings of the OSS storage space used for tables and databases. You can optimize the top-ranked tables and databases based on your business requirements.
Storage class distribution
This section displays the distribution of OSS storage classes. OSS provides the following storage classes: Standard, Infrequent Access (IA), Archive, and Cold Archive. To optimize storage costs, select the storage class that suits each type of business data.
DLF also provides a lifecycle management feature that automatically archives data in data lakes.
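A storage class distribution is the share of total bytes held in each class. The following sketch shows one way to compute it from a list of object records. The record shape `(storage_class, size_in_bytes)` and the function name are hypothetical, and the sketch does not call any DLF or OSS API.

```python
from collections import defaultdict


def storage_class_distribution(objects):
    """Return the fraction of total bytes stored in each storage class.

    objects: iterable of (storage_class, size_in_bytes) pairs (hypothetical shape).
    Returns an empty dict when there is no data.
    """
    totals = defaultdict(int)
    for storage_class, size in objects:
        totals[storage_class] += size

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cls: size / grand_total for cls, size in totals.items()}


# Hypothetical object records.
records = [
    ("Standard", 400),
    ("Standard", 200),
    ("IA", 300),
    ("Archive", 100),
]
shares = storage_class_distribution(records)
```

A large Standard share for data that is rarely read is the kind of pattern that suggests moving that data to a colder class.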
Storage format distribution
This section displays the storage format distribution of tables.
File distribution and rankings of small files (including ultra-small files)
This section displays the distribution of files across size levels and the rankings of small files (including ultra-small files). Use it to find tables that contain a large number of small files and optimize them based on your business requirements to improve query performance.
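The file distribution and small-file rankings can be pictured as a bucketing step followed by a count per table. The following sketch illustrates the idea only. The size thresholds (under 1 MiB for ultra-small, under 128 MiB for small) are assumptions chosen for the example, because this page does not state the thresholds that DLF uses, and all names are hypothetical.

```python
from collections import Counter

MIB = 1024 * 1024

# Assumed thresholds, not taken from DLF: (level name, exclusive upper bound).
SIZE_LEVELS = [
    ("ultra-small", 1 * MIB),
    ("small", 128 * MIB),
]


def size_level(size_bytes):
    """Return the size level of a file; anything above the last bound is 'regular'."""
    for name, upper_bound in SIZE_LEVELS:
        if size_bytes < upper_bound:
            return name
    return "regular"


def file_distribution(files):
    """Count files per size level. files: iterable of (table, size_in_bytes)."""
    return Counter(size_level(size) for _, size in files)


def rank_tables_by_small_files(files, top_n=10):
    """Return (table, small_file_count) pairs, most small files first.

    Ultra-small files are counted as small files.
    """
    counts = Counter()
    for table, size in files:
        if size_level(size) != "regular":
            counts[table] += 1
    return counts.most_common(top_n)


# Hypothetical file listing.
listing = [
    ("db.orders", 10),
    ("db.orders", 2 * MIB),
    ("db.events", 500 * MIB),
    ("db.events", 100),
]
distribution = file_distribution(listing)
ranking = rank_tables_by_small_files(listing)
```

Tables at the top of such a ranking are the usual candidates for file compaction.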