MaxCompute: Data lake analytics

Last Updated: Jan 15, 2026

Tutorials

Data transformation and multi-scenario orchestration on a data lake using MaxCompute

Use MaxLake to ingest data into a data lake and warehouse and enable analytics across multiple scenarios. This tutorial uses Internet of Vehicles (IoV) data to show how to analyze mileage and speed from vehicle GPS information. It also explains how to orchestrate multiple engines to support real-time query reports, cross-team collaboration, masked (desensitized) data sharing, and AI training, so that a single copy of data delivers value in multiple scenarios.
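
As a simplified illustration of the mileage and speed analysis, the following MaxCompute SQL sketch derives per-vehicle mileage and average speed from consecutive GPS points by using the haversine formula. The gps_points table and its columns are hypothetical placeholders, not names from the tutorial.

-- Hypothetical source table:
--   gps_points(vehicle_id STRING, ts DATETIME, lon DOUBLE, lat DOUBLE)
WITH ordered AS (
  SELECT
    vehicle_id, ts, lon, lat,
    LAG(lon) OVER (PARTITION BY vehicle_id ORDER BY ts) AS prev_lon,
    LAG(lat) OVER (PARTITION BY vehicle_id ORDER BY ts) AS prev_lat,
    LAG(ts)  OVER (PARTITION BY vehicle_id ORDER BY ts) AS prev_ts
  FROM gps_points
),
legs AS (
  SELECT
    vehicle_id,
    -- Haversine distance in km between consecutive points (Earth radius 6371 km).
    2 * 6371 * ASIN(SQRT(
      POW(SIN(RADIANS(lat - prev_lat) / 2), 2)
      + COS(RADIANS(prev_lat)) * COS(RADIANS(lat))
        * POW(SIN(RADIANS(lon - prev_lon) / 2), 2)
    )) AS leg_km,
    DATEDIFF(ts, prev_ts, 'ss') AS leg_seconds
  FROM ordered
  WHERE prev_ts IS NOT NULL
)
SELECT
  vehicle_id,
  SUM(leg_km)                               AS total_mileage_km,
  -- Average speed: total distance divided by total elapsed hours.
  SUM(leg_km) / (SUM(leg_seconds) / 3600.0) AS avg_speed_kmh
FROM legs
GROUP BY vehicle_id;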

Read CSV data from a data lake using DLF 1.0 and OSS

Configure Data Lake Formation (DLF) to extract metadata from Object Storage Service (OSS). Then, use a MaxCompute external schema to run federated queries on the data lake. This solution simplifies data analysis and processing while ensuring data reliability and security.
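
Once DLF has extracted the metadata and an external schema has been created in MaxCompute (the exact DDL is covered in the tutorial), the CSV data can be queried in place with a three-part name. A minimal sketch; my_project, csv_schema, and orders are hypothetical placeholders:

-- Federated query over CSV files in OSS, using DLF metadata through an external schema.
SELECT category, COUNT(*) AS order_cnt
FROM my_project.csv_schema.orders
GROUP BY category;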

Read Paimon data from a data lake using DLF 1.0 and OSS

Use Flink to create a Paimon DLF catalog. Read MySQL Change Data Capture (CDC) data and write it to OSS. Then, synchronize the metadata to DLF. Finally, use a MaxCompute external schema to run federated queries on the data lake.
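
The ingestion side of this pipeline can be sketched in Flink SQL as follows. All connection parameters, catalog options, and names are illustrative placeholders, and the 'dlf' metastore option assumes the Paimon DLF catalog setup described in the tutorial:

-- Paimon catalog: data files go to OSS, metadata is synchronized to DLF.
CREATE CATALOG paimon_dlf WITH (
  'type' = 'paimon',
  'metastore' = 'dlf',
  'warehouse' = 'oss://my-bucket/paimon-warehouse'  -- hypothetical path
);
CREATE DATABASE IF NOT EXISTS paimon_dlf.demo;

-- MySQL CDC source table; host and credentials are placeholders.
CREATE TEMPORARY TABLE mysql_orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'rm-example.mysql.rds.aliyuncs.com',
  'port' = '3306',
  'username' = 'flink_user',
  'password' = '********',
  'database-name' = 'demo',
  'table-name' = 'orders'
);

-- Continuously replicate the change stream into a Paimon table.
CREATE TABLE IF NOT EXISTS paimon_dlf.demo.orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
);
INSERT INTO paimon_dlf.demo.orders SELECT * FROM mysql_orders;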

Read Parquet data from a data lake using a schemaless query

This tutorial uses an E-MapReduce (EMR) Serverless Spark cluster as an example. It shows how to use a schemaless query in MaxCompute to read Parquet files generated by Spark SQL. After the computation is complete, you can use the UNLOAD command to write the results back to OSS.
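
The write-back step relies on the MaxCompute UNLOAD statement. A minimal sketch; the inner SELECT stands in for the schemaless read described in the tutorial, and the table name, endpoint, bucket, and directory are hypothetical placeholders:

-- Export query results back to OSS as Parquet files.
UNLOAD FROM (
  SELECT user_id, SUM(amount) AS total_amount
  FROM   spark_results   -- placeholder for the data read via the schemaless query
  GROUP  BY user_id
)
INTO LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/my-bucket/result-dir/'
STORED AS PARQUET;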

Read Hadoop Hive data using HMS and HDFS

This tutorial uses Hive on E-MapReduce as an example. It shows how to create an external schema in MaxCompute and query Hive table data in Hadoop.
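
After the external schema that maps to the Hive Metastore (HMS) database is created (the DDL is covered in the tutorial), Hive tables on HDFS can be queried directly from MaxCompute. A minimal sketch; my_project, hive_schema, and user_events are hypothetical placeholders:

-- Federated query over a Hive table in Hadoop, without moving the data.
SELECT dt, COUNT(*) AS row_cnt
FROM my_project.hive_schema.user_events
GROUP BY dt;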

Create metadata mapping and data synchronization for Hologres

This tutorial shows how to use MaxCompute to create a metadata mapping for Hologres and synchronize data between the two services.

Read and write Paimon data on a data lake using an external project and a FileSystem Catalog

Use Flink to create a Paimon catalog and generate data. Then, use MaxCompute to create an external project based on the FileSystem Catalog to directly read the Paimon table data.
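
The Flink side of this tutorial reduces to creating a FileSystem-based Paimon catalog on OSS and generating a table in it. A minimal sketch; the warehouse path and the table definition are hypothetical placeholders:

-- Paimon FileSystem catalog: metadata and data files both live under the OSS warehouse path.
CREATE CATALOG paimon_fs WITH (
  'type' = 'paimon',
  'metastore' = 'filesystem',
  'warehouse' = 'oss://my-bucket/paimon-warehouse'  -- hypothetical path
);
USE CATALOG paimon_fs;
CREATE DATABASE IF NOT EXISTS demo;

-- Generate sample data for MaxCompute to read through the external project.
CREATE TABLE IF NOT EXISTS demo.vehicle_stats (
  vehicle_id STRING,
  mileage_km DOUBLE,
  PRIMARY KEY (vehicle_id) NOT ENFORCED
);
INSERT INTO demo.vehicle_stats VALUES ('v001', 120.5), ('v002', 88.0);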

(Invitational preview) Use an external project to read and write Paimon data on a data lake using DLF

Use Flink to create a Paimon DLF catalog. Read MySQL CDC business data and write it to DLF. Then, use a MaxCompute external project to run federated queries and analysis on the data lake and write the results back to DLF. This topic uses the new version of DLF, which is different from DLF 1.0 used in the preceding tutorials.
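
With the external project in place, both the read and the write-back are plain MaxCompute SQL. A minimal sketch; ext_paimon_project, paimon_db, and the table and column names are hypothetical placeholders, and the external project DDL itself is covered in the tutorial:

-- Federated query over Paimon data registered in the new DLF.
SELECT ds, COUNT(*) AS order_cnt, SUM(amount) AS total_amount
FROM ext_paimon_project.paimon_db.orders
GROUP BY ds;

-- Write aggregated results back to a Paimon table managed by DLF.
INSERT INTO ext_paimon_project.paimon_db.order_daily_stats
SELECT ds, COUNT(*) AS order_cnt, SUM(amount) AS total_amount
FROM ext_paimon_project.paimon_db.orders
GROUP BY ds;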