All Products
Search
Document Center

MaxCompute:Flag parameters

最終更新日:Dec 04, 2024

This topic provides details about flag parameters, including the parameter name, parameter description, and scope of impact. You can search for parameters on the Parameters page.

Category

Scope of impact

Parameter

Description

Valid values

Affected SQL statements and functions

Ecosystem

Session

console.sql.result.instancetunnel

Specifies whether to enable InstanceTunnel. For more information about Tunnel commands, see Usage notes.

  • True: enabled.

  • False: disabled.

SELECT

Scheduling

Session

odps.stage.mapper.mem

The memory size of each Map worker.

Valid values: 256 MiB to 12288 MiB. Default value: 1024 MiB.

ALL

Scheduling

Session

odps.stage.reducer.mem

The memory size of each Reduce worker.

Valid values: 256 MiB to 12288 MiB. Default value: 1024 MiB.

ALL

Scheduling

Session

odps.stage.joiner.mem

The memory size of each Join worker.

Valid values: 256 MiB to 12288 MiB. Default value: 1024 MiB.

ALL

Scheduling

Session

odps.stage.mem

The total memory size of all workers in a specified MaxCompute job. This parameter has a lower priority than the odps.stage.mapper.mem, odps.stage.reducer.mem, and odps.stage.joiner.mem parameters.

Valid values: 256 MiB to 12288 MiB. No default value.

ALL

Scheduling

Session

odps.stage.mapper.split.size

The input data amount of each Map worker, which is the split size of the input file. You can use this parameter to indirectly control the number of workers at each Map stage.

Note

If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

Unit: MiB. Default value: 256 MiB.

ALL

Scheduling

Session

odps.sql.split.size

The size of a single shard of data for table-level parallel processing.

Unit: MiB.

For example, {"table1": 1024, "table2": 512}.

ALL

Scheduling

Session

odps.sql.split.row.count

The number of rows in a single shard for table-level parallel processing.

Note

Can only be used for internal tables, non-transactional tables, and non-clustered tables.

Valid values: The value of this parameter must be a positive integer.

For example, {"table1": 100, "table2": 500}.

ALL

Scheduling

Session

odps.sql.split.dop

The parallelism level for table-level processing.

Note

Can only be used for internal tables, non-transactional tables, and non-clustered tables.

Valid values: The value of this parameter must be a positive integer.

For example, {"table1": 1, "table2": 5}.

ALL

Scheduling

Session

odps.stage.reducer.num

The number of workers at each Reduce stage.

Note

If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

-

ALL

Scheduling

Session

odps.stage.joiner.num

The number of workers at each Join stage.

Note

If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

-

ALL

Scheduling

Session

odps.stage.num

The total number of concurrent workers in a specified MaxCompute job. This parameter has a lower priority than the odps.stage.mapper.split.size, odps.stage.reducer.num, and odps.stage.joiner.num parameters.

Note

If the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this parameter, we recommend that you do not add the LIMIT keyword to SQL statements.

-

ALL

Scheduling

Project

odps.instance.priority.enable

Specifies whether to enable the job priority feature for a project that uses subscription computing resources. For more information about the job priority feature, see Job priority.

  • True: enabled.

  • False: disabled.

Note

Only the project owner or a user who is assigned the Super_Administrator role can run this command to enable the priority feature.

ALL

SQL

Session

odps.sql.reshuffle.dynamicpt

Specifies whether to enable dynamic partitioning. Dynamic partitioning prevents the generation of a large number of small files.

  • True: enabled.

  • False: disabled.

Note

If a small number of dynamic partitions are generated, we recommend that you set this parameter to False to prevent data skew.

  • INSERT INTO

  • INSERT OVERWRITE

SQL

Session

odps.sql.udf.getjsonobj.new

Specifies whether the GET_JSON_OBJECT function retains the original string when it returns a value.

For MaxCompute projects created on or after January 21, 2021, the GET_JSON_OBJECT function retains the original strings for the returned results by default. For MaxCompute projects created before January 21, 2021, the GET_JSON_OBJECT function returns JSON reserved characters by using escape characters by default. This prevents the negative impact on existing jobs.

For more information about the GET_JSON_OBJECT function, see String functions.

  • True: enabled.

  • False: disabled.

UDF

SQL

Session

odps.sql.udf.jvm.memory

The maximum memory size of the Java Virtual Machine (JVM) heap for a user-defined function (UDF).

If a large amount of data is calculated and sorted in the memory by using a UDF, an out of memory (OOM) error may occur. In this case, you can increase the value of this parameter. However, you must optimize UDF code based on your business requirements to prevent OOM errors.

Valid values: 256 MiB to 12288 MiB. Default value: 1024 MiB.

UDF

SQL

Session

odps.function.timeout

The timeout period of a UDF.

Valid values: 0s to 3600s. Default value: 600s.

UDF

SQL

Session

odps.sql.session.resources

The resources that are referenced by a user-defined type (UDT). You can specify multiple resources and separate them with commas (,). For more information about referenced resources, see Overview.

Uploaded resources.

UDT

SQL

Session

odps.sql.udt.display.tostring

Specifies whether to enable the java.util.Objects.toString(...) mechanism on the Wrap for all the columns whose outputs are UDTs.

  • True: enabled.

  • False: disabled.

UDT

SQL

Session

odps.sql.session.java.imports

Java packages that are referenced by UDTs. You can specify multiple Java packages and separate them with commas (,). For more information about Java packages, see Overview.

Uploaded Java packages.

UDT

SQL

Session

odps.sql.skewjoin

Specifies whether to enable the SKEWJOIN feature to prevent long tails.

  • True: enabled.

  • False: disabled.

  • SELECT

  • JOIN

SQL

Session

odps.sql.skewinfo

The key and value on which SKEWJOIN is executed. For more information, see Data skew tuning.

-

  • SELECT

  • JOIN

SQL

Session

odps.sql.udf.ppr.deterministic

Specifies whether to enable partition pruning for UDFs. For more information about partition pruning, see WHERE clause (where_condition).

  • True: enabled.

  • False: disabled.

UDF

SQL

Session

odps.sql.udf.ppr.to.subquery

Specifies whether to ignore the error that occurs during result backfilling when partition pruning is being performed. For more information about partition pruning, see the "WHERE clause (where_condition)" section in SELECT syntax.

  • True: enabled.

  • False: disabled.

UDF

SQL

Session

odps.optimizer.enable.range.partial.repartitioning

Specifies whether to enable the Shuffle Remove feature for range-clustered tables.

  • True: enabled.

  • False: disabled.

  • INSERT OVERWRITE

  • CREATE TABLE

SQL

Session

odps.optimizer.merge.partitioned.table

When a query repeatedly uses the same partitioned table, you can set this parameter to enable the system to merge the partitioned table, thereby minimizing IO operations on the partitioned table and improving performance.

  • True: enabled.

  • False: disabled.

SELECT

SQL

Session

odps.optimizer.skew.join.topk.num

The number of hot key values obtained by the optimizer when it performs an aggregate operation. For more information, see SKEWJOIN HINT.

-

SKEWJOIN HINT

SQL

Session

odps.optimizer.stat.collect.auto

Specifies whether to enable the Freeride feature. After the Freeride feature is enabled, statistical data of columns in a table is automatically collected. For more information about how to collect optimizer information, see Collect information for the optimizer of MaxCompute.

  • True: enabled.

  • False: disabled.

  • CREATE TABLE

  • INSERT INTO

  • INSERT OVERWRITE

SQL

Session

odps.optimizer.stat.collect.plan

A collection plan that is used to collect specific column stats metrics of specific columns. For more information about how to collect optimizer information, see Collect information for the optimizer of MaxCompute.

-

  • CREATE TABLE

  • INSERT INTO

  • INSERT OVERWRITE

SQL

Session

odps.sql.external.net.vpc

Specifies whether to enable virtual private cloud (VPC) for external tables. For more information, see

Hologres foreign tables.

  • True: enabled.

  • False: disabled.

CREATE TABLE

SQL

Session

odps.sql.groupby.position.alias

Specifies whether to use integer constants in the GROUP BY clause as column numbers in a SELECT statement.

  • True: enabled.

  • False: disabled.

  • SELECT

  • GROUP BY

SQL

Session

odps.sql.groupby.skewindata

Specifies whether to enable the anti-skew feature for the GROUP BY clause.

  • True: enabled.

  • False: disabled.

GROUP BY

SQL

Session

odps.sql.orderby.position.alias

Specifies whether to use integer constants in the ORDER BY clause as column numbers in a SELECT statement.

  • True: enabled.

  • False: disabled.

  • ORDER BY

  • SELECT

SQL

Session

odps.sql.mapjoin.memory.max

The size of small tables that are read to the memory when the MAPJOIN statement is executed. Unit: MiB.

Note

This parameter is also subject to the total memory global limit at the task level. For multiple small tables within the same task, the total memory global upper limit equals std::min(8G, odps.sql.mapjoin.memory.max * the number of small tables in the same task).

Valid values: 0 MiB to 8192 MiB.

JOIN

SQL

Session

odps.sql.distributed.map.join.memory.max

The memory size of each instance of the HashTableBuilder when the MAPJOIN statement is executed. The default value is 2048. Unit: MiB.

Valid values: 0 MiB to 8192 MiB.

JOIN

SQL

Session

odps.sql.python.version

The Python version on which SQL statements are executed.

  • cp27

  • cp37

  • UDTF

  • UDAF

SQL

Session

odps.sql.select.output.format

Specifies whether to display the table headers in the returned results of the MaxCompute client. For more information about how to use the MaxCompute client, see MaxCompute client (odpscmd).

  • {""needHeader"":false,""fieldDelim"":""""}: The table headers are not displayed.

  • {""needHeader"":true,""fieldDelim"":""""}: The table headers are displayed.

ALL

SQL

Session

odps.sql.unstructured.data.oss.use.https

Specifies whether to enable the mechanism of obtaining data by using HTTPS at the underlying layer when you access external tables. For more information about Object Storage Service (OSS) external tables, see Create an OSS external table.

  • True: enabled.

  • False: disabled.

ALL

SQL

Session

odps.sql.decimal.tostring.trimzero

Specifies whether to remove the zero at the end of the decimal point when the data type is DECIMAL. The default value is True, which indicates to remove the zero at the end of the decimal point.

  • True: The zero at the end of the decimal point is removed.

  • False: The zero at the end of the decimal point is remained.

CAST

SQL

Session

odps.sql.unstructured.tablestore.put.row

Specifies whether to call the PutRow operation to write data to a Tablestore external table.

Note

For more information about PutRow, see PutRow.

  • True: enabled.

  • False: disabled.

ALL

SQL

Session

odps.sql.unstructured.external.max.dop

The maximum concurrency of the workers that are accessed by the external table.

The value of this parameter must be a positive integer.

Commands and functions related to data queries and writes based on external tables

SQL

Session

odps.meta.exttable.stats.onlinecollect

Since the data for external queries is stored in a data lake, metadata processing cannot be performed locally due to considerations of openness. Additionally, because the data is located externally and statistical information has not been pre-collected, the query optimizer adopts a conservative strategy, leading to low query efficiency. The optimizer can discover small tables by temporarily collecting statistics on the tables during query execution. It then proactively uses Hash Join, optimizes the Join Order, reduces extensive shuffle operations, and shortens the execution pipeline, ultimately optimizing the query plan.

  • True: enabled.

  • False (default): disabled.

Data queries based on the external tables

SQL

Project

odps.sql.allow.fullscan

Specifies whether to enable a full table scan on a project. A full table scan occupies a large number of resources, which reduces data processing efficiency. Therefore, we recommend that you do not enable this feature.

  • True: A full table scan is enabled.

  • False: A full table scan is disabled.

SELECT

SQL

Project

odps.table.lifecycle

Specifies whether to configure a lifecycle for tables in a project.

  • Optional: The lifecycle clause is optional in a table creation statement. If you do not configure a lifecycle for a table, the table does not expire.

  • Mandatory: The lifecycle clause is required in a table creation statement.

  • Inherit: If you do not configure a lifecycle for a table when you create the table, the value of odps.table.lifecycle.value is used by default.

CREATE TABLE

SQL

Project

odps.table.lifecycle.value

The lifecycle of a table. Unit: days.

Valid values: 1 to 37231. Default value: 37231.

CREATE TABLE

SQL

Project

READ_TABLE_MAX_ROW

The maximum number of data records that can be returned by a SELECT statement.

Valid values: 1 to 10000. Default value: 10000.

SELECT

SQL

Project

odps.output.field.formatter

The dynamic data masking rules for SQL query results. For more information about dynamic data masking in MaxCompute, see Dynamic data masking.

Custom dynamic data masking rules.

SELECT

SQL

Project

odps.sql.acid.table.enable

Specifies whether to enable the atomicity, consistency, isolation, and durability (ACID) mechanism. For more information about ACID, see ACID semantics.

  • True: enabled.

  • False: disabled.

ALL

SQL

Session/Project

odps.io.oss.use.vipserver

Access to the Object Storage Service (OSS) can be performed by a VIPServer for external access. If this setting is not enabled, access to OSS will occur through a Server Load Balancer (SLB). The VIPServer method allows for higher traffic limits and is subject to flow control, but it also carries certain stability risks, such as query failures. You should evaluate these risks before deciding to use this option.

  • True: enabled.

  • False (default): disabled.

Data queries based on the external tables

SQL

Session/Project

odps.sql.executionengine.enable.string.to.date.full.format

Specifies whether to convert a date string that contains the hour, minute, and second parts. Default value: False.

  • True: enabled.

  • False: disabled.

CAST

SQL

Session/Project

odps.sql.executionengine.enable.rand.time.seed

If you set this parameter to False, the RAND function uses the instance ID as the seed for random number initialization. This ensures the idempotence of the function.

If you set this parameter to True, the RAND function uses the system time as the seed for random number initialization. In this case, the RAND function is no longer idempotent and cannot be used as a shuffle key. The return result varies each time you use the RAND function.

Default value: False.

  • True: enabled.

  • False: disabled.

RAND

SQL

Session/Project

odps.sql.type.system.odps2

Specifies whether to enable the MaxCompute V2.0 data type edition. For more information about data types supported in the MaxCompute V2.0 data type edition, see MaxCompute V2.0 data type edition.

  • True: enabled.

  • False: disabled.

Additional functions of MaxCompute V2.0

SQL

Session/Project

odps.sql.type.json.enable

Specifies whether to enable JSON data types. For more information about JSON data types, see Instructions for using JSON data types in MaxCompute.

  • True: enabled.

  • False: disabled.

MaxCompute JSON data types

SQL

Session/Project

odps.sql.hive.compatible

Specifies whether to enable the Hive-compatible data type edition. MaxCompute supports Hive syntaxes, such as inputRecordReader, outputRecordReader, and Serde, only after the Hive-compatible data type edition is enabled. For more information about the data types supported in the Hive-compatible data type edition, see Hive-compatible data type edition.

  • True: enabled.

  • False: disabled.

ALL

SQL

Session/Project

odps.sql.metering.value.max

The upper limit on resources that can be consumed by an SQL statement. For more information about consumption control, see Consumption control.

-

ALL

SQL

Session/Project

odps.sql.timezone

The time zone of the MaxCompute project that you accessed. For more information about time zones, see Time zone configuration operations.

-

ALL

SQL

Session/Project

odps.sql.unstructured.oss.commit.mode

Specifies whether to enable the multipart upload feature of OSS to write data to OSS external tables. For more information, see Write data to OSS.

  • True: enabled.

  • False: disabled.

INSERT OVERWRITE

SQL

Session/Project

odps.sql.groupby.orderby.position.alias

Specifies whether to use integer constants in the GROUP BY and ORDER BY clauses as column numbers in a SELECT statement.

Note

If this parameter is set to True for an existing project, data parsing or other operations may fail to be performed. You must make sure that the original logic can be correctly executed for the existing project when you set this parameter to True. Otherwise, specify this parameter for sessions.

  • True: enabled.

  • False: disabled.

  • GROUP BY

  • ORDER BY

  • SELECT

SQL

Session/Project

odps.ext.oss.orc.native

Specifies whether to use a native Optimized Row Columnar (ORC) reader to read data from tables.

  • True: enabled.

  • False: disabled.

SELECT

SQL

Session/Project

odps.sql.job.max.time.hours

The maximum duration in which a single job runs.

Note

If you need to configure the parameter at the project level, you can click the link or search for the DingTalk group (ID: 11782920) to contact MaxCompute technical support.

Valid values: 1 to 72. Default value: 24.

SQL jobs

Billing rules

Session

odps.task.quota.preference.tag

The quota group that is allocated to a specified job. You can configure the quota group on MaxCompute Management. For projects that use subscription resources, you can configure this parameter to specify a level-2 quota group that can be used by a job. A job is preferentially scheduled to a quota group whose quota tag is the same as the quota tag of the job. You must set a quota tag for a job when you submit the job. If no quota group has the same quota tag, the job is scheduled to the quota group specified by the project to which the job belongs.

You can execute the following statement to configure a quota tag:

set odps.task.quota.preference.tag = payasyougo
Note

This parameter is available only for SQL jobs.

tag_name specifies the quota tag of the quota group that you configure on MaxCompute Management. You can configure tag_name only for quota groups in the region where the project of your job belongs. The value of tag_name can contain letters, digits, and underscores (_).

ALL

Billing rules

Session

odps.task.wlm.quota

The name of the quota that is used to run jobs in a session.

-

ALL

Security and permissions

Project

odps.forbid.fetch.result.by.bearertoken

Specifies whether to display the results of jobs on the Result tab of LogView. This parameter is used to protect data security.

  • True: The execution results of jobs are not displayed on the Result tab of LogView.

  • False: The execution results of jobs are displayed on the Result tab of LogView.

ALL

Security and permissions

Project

LabelSecurity

Specifies whether to enable LabelSecurity. For more information about LabelSecurity, see Label-based access control.

  • True: enabled.

  • False: disabled.

ALL

Security and permissions

Project

CheckPermissionUsingACL

Specifies whether to enable ACL-based access control. For more information about ACL-based access control, see MaxCompute permissions.

  • True: enabled.

  • False: disabled.

ALL

Security and permissions

Project

CheckPermissionUsingPolicy

Specifies whether to enable policy-based access control. For more information about policy-based access control, see Policy-based access control.

  • True: enabled.

  • False: disabled.

ALL

Security and permissions

Project

ObjectCreatorHasAccessPermission

Specifies whether creators of objects can perform operations on the objects created by themselves.

  • True: Creators of objects can perform operations on the objects that are created by themselves.

  • False: Creators of objects cannot perform operations on the objects created by themselves.

ALL

Security and permissions

Project

ObjectCreatorHasGrantPermission

Specifies whether creators of objects can authorize other users to perform operations on the objects.

  • True: Creators of objects can authorize other users to perform operations on the objects.

  • False: Creators of objects cannot authorize other users to perform operations on the objects.

ALL

Security and permissions

Project

ProjectProtection

Specifies whether to enable project data protection. For more information about project data protection, see Project data protection.

  • True: enabled.

  • False: disabled.

ALL

Security and permissions

Project

odps.output.field.formatter

The dynamic data masking rules for SQL query results. For more information about dynamic data masking, see Dynamic data masking.

Custom dynamic data masking rules.

SELECT

Permission security

Project

odps.security.ip.whitelist

The whitelist of IP addresses that are authorized to access the project over the cloud product interconnection network. For more information about IP address whitelists, see Manage IP address whitelists.

The list of IP addresses. Separate IP addresses with commas (,).

ALL

Permission security

Project

odps.security.vpc.whitelist

The whitelist of IP addresses that are authorized to access the project over a specific virtual private cloud (VPC). For more information about IP address whitelists, see Manage IP address whitelists.

RegionID_VPCID[IP Address]

ALL

Data type

Project

odps.sql.decimal.odps2

Specifies whether to enable DECIMAL(precision,scale) in the MaxCompute V2.0 data type edition. For more information about data types, see MaxCompute V2.0 data type edition.

  • True: enabled.

  • False: disabled.

ALL

Data storage

Project

odps.timemachine.retention.days

The retention period for backup data in the unit of days. For more information about data backup, see Backup and restoration.

Valid values: 0 to 30. Default value: 1.

ALL

Metadata

Project

odps.schema.evolution.enable

Specifies whether to enable schema evolution.

  • True: enabled.

  • False (default): disabled.

Schema evolution

Query flag parameter values

You can query values of flag parameters at the project level or session level.

  • Project level: To query values of all flag parameters for a project, run the setproject; command on the MaxCompute client (odpscmd).

  • Session level: To query values of flag parameters for a session, use LogView.