This topic recommends what to read based on your role.
MaxCompute beginners
If you are a beginner in MaxCompute, we recommend that you first familiarize yourself with the modules described in the following table.
Module | Description |
Provides an overview of MaxCompute and describes the features, scenarios, limits, and basic concepts of MaxCompute. This module helps you obtain a general knowledge of MaxCompute. | |
Describes how to create an account, prepare an environment, create a table, import data, run SQL jobs, and export returned data. | |
Describes the commonly used commands in MaxCompute. This module helps you familiarize yourself with operations on MaxCompute. | |
Describes the common tools in MaxCompute, such as the MaxCompute client and MaxCompute Studio. Before you analyze data, you must familiarize yourself with the tools. | |
Describes the network connection modes supported in different regions and the endpoints that correspond to each region. This module also describes the issues that may occur when MaxCompute is connected to other Alibaba Cloud services, such as Elastic Compute Service (ECS), Tablestore, and Object Storage Service (OSS). These issues include network connectivity issues and issues related to data download charges. |
Data analysts
If you are a data analyst, we recommend that you familiarize yourself with the SQL topics. MaxCompute SQL lets you query and analyze large volumes of data stored in MaxCompute. The following table describes the features that MaxCompute SQL provides.
Feature | Description |
Allows you to manage tables, partitions, columns, lifecycles, and views. | |
Allows you to insert data into or update data in tables or partitions. | |
Allows you to perform various query operations, such as SELECT and subqueries. | |
Allows you to use commands to perform SQL enhancement operations, such as importing data into MaxCompute tables, exporting data from MaxCompute tables, and cloning table data. | |
Allows you to process data by using MaxCompute built-in functions, such as the mathematical functions, window functions, date functions, aggregate functions, and string functions. | |
Allows you to create user-defined functions (UDFs) to meet your computing requirements. |
Users with development experience
If you have development experience, understand distributed architectures, and need data analytics capabilities that SQL cannot deliver, we recommend that you familiarize yourself with the advanced functional modules of MaxCompute.
Module | Description |
MapReduce | MaxCompute provides the MapReduce programming model in Java. You can use the Java API provided by MapReduce to write MapReduce programs and process data in MaxCompute. |
Graph | Graph is a processing framework for iterative graph computing. A graph consists of vertices and edges, both of which contain values. MaxCompute Graph iteratively edits and evolves graphs to obtain analysis results. |
Tunnel | MaxCompute Tunnel enables you to upload large amounts of data to MaxCompute or download large amounts of data from MaxCompute at a time. |
SDK for Java | MaxCompute provides an SDK for Java for developers. |
SDK for Python | MaxCompute provides an SDK for Python for developers. |
Project owners or administrators
If you are a project owner or administrator, we recommend that you familiarize yourself with the modules described in the following table. A project owner can create and use projects, and a project administrator can manage projects, security operations, and costs.
Module | Feature | Description |
Project management | Prepare for project creation | A project is a basic organizational unit of MaxCompute. Similar to a database or schema in a traditional database system, a project is used to isolate users and control access requests. A user can have permissions on multiple projects. After a user is granted the related permissions, the user can access objects, such as tables, resources, functions, and instances, across projects. MaxCompute is used to manage various objects in projects. You must make the following preparations before you create a project:
|
Create a project | For more information, see Create a MaxCompute project. | |
Manage project members | Members are managed based on member responsibilities and security requirements. If you use MaxCompute in the DataWorks console, you must understand the permission relationships between MaxCompute and DataWorks. | |
Manage RAM users | You can manage MaxCompute projects by using your Alibaba Cloud account or the credentials of a RAM user, and you can add RAM users of your Alibaba Cloud account to a MaxCompute project. For more information about RAM users, see Prepare a RAM user. If you manage MaxCompute projects and DataWorks workspaces in the DataWorks console, you can add only RAM users of your Alibaba Cloud account as members. Therefore, you must use your Alibaba Cloud account to create these RAM users and manage them in the Resource Access Management (RAM) console.
| |
Manage scheduling resources | You must manage the scheduling resources of DataWorks, which execute or distribute the tasks delivered by the scheduling system. Scheduling resources of DataWorks are categorized into the following types:
| |
Configure projects | Only the owner of a project has the permissions to configure the project. For example, the project owner can specify whether to enable full table scan and whether to enable the MaxCompute V2.0 data type edition. For more information, see Project operations. | |
Cost management | None | Resource budgets help you estimate costs before you use resources. Because MaxCompute supports different billing methods, precise costs are difficult to estimate, so you must manage costs throughout the entire business development process.
|