You can execute the SET statement to configure MaxCompute system variables for the current session.
Syntax
SET <KEY>=<VALUE>
Parameters
KEY: the name of the property.
VALUE: the value of the property.
The following table describes common properties at the session level.
Property
Description
Valid values
console.sql.result.instancetunnel
Specifies whether to enable InstanceTunnel. For more information, see Usage notes.
True: enabled.
False: disabled.
odps.stage.mapper.mem
The memory size of each Map worker.
Valid values: 256 to 12288.
Default value: 1024. Unit: MB.
odps.stage.reducer.mem
The memory size of each Reduce worker.
Valid values: 256 to 12288.
Default value: 1024. Unit: MB.
odps.stage.joiner.mem
The memory size of each Join worker.
Valid values: 256 to 12288.
Default value: 1024. Unit: MB.
odps.stage.mem
The total memory size of all workers in a specified MaxCompute job. This property has a lower priority than the odps.stage.mapper.mem, odps.stage.reducer.mem, and odps.stage.joiner.mem properties.
Valid values: 256 to 12288.
No default value. Unit: MB.
odps.stage.mapper.split.size
The input data amount of each Map worker, which is the split size of the input file. You can use this property to indirectly control the number of workers at each Map stage.
NoteIf the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.
Default value: 256. Unit: MB.
odps.stage.reducer.num
The number of workers at each Reduce stage.
NoteIf the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.
Valid values: 1 to 10000.
odps.stage.joiner.num
The number of workers at each Join stage.
NoteIf the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.
Valid values: 1 to 10000.
odps.stage.num
The total number of concurrent workers in a specified MaxCompute job. This parameter has a lower priority than the odps.stage.mapper.split.size, odps.stage.reducer.num, and odps.stage.joiner.num parameters.
NoteIf the LIMIT keyword is used in SQL statements, the number of concurrent workers is limited to 1. This way, only a single worker is allowed to run at a time. In this case, when you configure this property, we recommend that you do not add the LIMIT keyword to SQL statements.
Valid values: 1 to 10000.
odps.sql.reshuffle.dynamicpt
Specifies whether to enable dynamic partitioning. Dynamic partitioning prevents the generation of a large number of small files.
True: enabled.
False: disabled.
NoteIf a small number of dynamic partitions are generated, we recommend that you set this property to False to prevent data skew issues.
odps.sql.type.system.odps2
Specifies whether to enable the MaxCompute V2.0 data type edition. For more information about the MaxCompute V2.0 data type edition, see MaxCompute V2.0 data type edition.
True: enabled.
False: disabled.
odps.sql.hive.compatible
Specifies whether to enable the Hive-compatible data type edition. MaxCompute supports Hive syntaxes, such as
inputRecordReader
,outputRecordReader
, andSerde
, only after the Hive-compatible data type edition is enabled. For more information about the Hive-compatible data type edition, see Hive-compatible data type edition.True: enabled.
False: disabled.
odps.sql.executionengine.coldata.deep.buffer.size.max
The cache size requested to store a column that contains data of complex types when MaxCompute writes data to a table. This improves the write performance.
You can configure this property if the output table contains excessive data of complex types or the size of a variable of a specified complex type in the output table is too large.
If the schemas of three columns in the output table are of complex data types, such as STRING, MAP, STRUCT, ARRAY, or BINARY, MaxCompute automatically reserves 192 MB (64 × 3) of memory for table write operations. The cache requested for each column is used to store data in the row that is specified by
batch row count
.If the estimated size of the variables of complex types in the table is small, we recommend that you set this property to a small value. For example, if the size of each variable of a specific complex type does not exceed 1,024 bytes and the default value (1024) of
batch row count
is used, you can set the value of this property to 1048576 (1024 × 1024). Sample statement:set odps.sql.executionengine.coldata.deep.buffer.size.max=1048576;
If the estimated size of each variable of a complex type is between 7 MB and 8 MB, and
batch row count
is set to 32, you can adjust the value of this property to 256 MB (8 × 32).If the output table contains variables of complex types or the small table involved in the
MAPJOIN
operation contains data of complex types, the memory used to run jobs may be adversely affected when you adjust the value of this property. Based on the preceding calculation method, a large value of this property may cause an out of memory (OOM) error.
Unit: bytes. Default value: 67108864.
odps.sql.udf.getjsonobj.new
Specifies whether the GET_JSON_OBJECT function retains the original string when it returns a value.
From April 20, 2021, the GET_JSON_OBJECT function automatically retains the original strings for the returned results in new MaxCompute projects. For MaxCompute projects created before April 20, 2021, the GET_JSON_OBJECT function automatically returns JSON-formatted reserved characters by using escape characters. This prevents the impact on existing jobs.
For more information about the GET_JSON_OBJECT function, see GET_JSON_OBJECT.
True: enabled.
False: disabled.
odps.sql.udf.jvm.memory
The maximum memory size of the Java Virtual Machine (JVM) heap for a user-defined function (UDF).
If a large amount of data is calculated and sorted in the memory by using a UDF, the OOM error may occur. In this case, you can increase the value of this property. However, you must optimize UDF code based on your business requirements to prevent OOM errors.
Valid values: 256 to 12288.
Default value: 1024. Unit: MB.
odps.function.timeout
The timeout period of a UDF.
Valid values: 0 to 3600.
Default value: 600. Unit: seconds.
LabelSecurity
Specifies whether to enable LabelSecurity. For more information about LabelSecurity, see Label-based access control.
True: enabled.
False: disabled.
odps.sql.session.resources
The resources that are referenced by user-defined types (UDTs). You can specify multiple resources and separate them with commas (,). For more information about how to reference resources, see Overview.
The uploaded resources.
odps.sql.udt.display.tostring
Specifies whether to enable the
java.util.Objects.toString(...)
mechanism on the Wrap for all the columns whose outputs are UDTs.True: enabled.
False: disabled.
odps.sql.session.java.imports
The Java packages that are referenced by UDTs. You can specify multiple Java packages and separate them with commas (,). For more information about how to reference Java packages, see Overview.
The uploaded Java packages.
CheckPermissionUsingACL
Specifies whether to enable ACL-based access control. For more information about ACL-based access control, see MaxCompute permissions.
True: enabled.
False: disabled.
CheckPermissionUsingPolicy
Specifies whether to enable policy-based access control. For more information about policy-based access control, see Policy-based access control.
True: enabled.
False: disabled.
ObjectCreatorHasAccessPermission
Specifies whether creators of objects can perform operations on the objects created by themselves.
True: Creators of objects can perform operations on the objects that are created by themselves.
False: Creators of objects cannot perform operations on the objects created by themselves.
ObjectCreatorHasGrantPermission
Specifies whether creators of objects can authorize other users to perform operations on the objects.
True: Creators of objects can authorize other users to perform operations on the objects.
False: Creators of objects cannot authorize other users to perform operations on the objects.
ProjectProtection
Specifies whether to enable project data protection. For more information about the project data protection mechanism, see Project data protection.
True: enabled.
False: disabled.
odps.output.field.formatter
The dynamic data masking rule for SQL query results. For more information about dynamic data masking, see Dynamic data masking.
The custom dynamic data masking rule.
odps.isolation.session.enable
Specifies whether to enable dynamic data masking for a single session. For more information about dynamic data masking, see Dynamic data masking.
True: enabled.
False: disabled.
odps.sql.skewjoin
Specifies whether to enable the SKEWJOIN feature to prevent long tails.
True: enabled.
False: disabled.
odps.sql.skewinfo
The key and value on which SKEWJOIN is executed. For more information, see Data skew tuning.
-
odps.sql.udf.ppr.deterministic
Specifies whether to enable partition pruning for UDFs. For more information about partition pruning, see the "WHERE clause (where_condition)" section in SELECT syntax.
True: enabled.
False: disabled.
odps.sql.udf.ppr.to.subquery
Specifies whether to ignore the error that occurs during result backfilling when partition pruning is being performed. For more information about partition pruning, see the "WHERE clause (where_condition)" section in SELECT syntax.
True: enabled.
False: disabled.
odps.optimizer.enable.range.partial.repartitioning
Specifies whether to enable the Shuffle Remove feature for range-clustered tables.
True: enabled.
False: disabled.
odps.optimizer.skew.join.topk.num
The number of hot key values obtained by the optimizer when it performs an Aggregate operation. For more information, see SKEWJOIN HINT.
-
odps.optimizer.stat.collect.auto
Specifies whether to enable the Freeride feature. After the Freeride feature is enabled, the column stats metrics of tables are automatically collected. For more information, see Optimizer.
True: enabled.
False: disabled.
odps.optimizer.stat.collect.plan
The collection plan that is used to collect specific column stats metrics of specific columns. For more information, see Optimizer.
-
odps.sql.executionengine.batch.rowcount
The number of data rows that can be processed by the SQL engine at a time. Default value: 1024. If an OOM error occurs because of a large field in a row or if a timeout occurs due to poor UDF performance, you must set this property to a smaller value. We recommend that you do not set this property to an excessively small value because an excessively small value may adversely affect the performance of the SQL engine.
Valid values: 1 to 1024.
odps.sql.external.net.vpc
Specifies whether to enable virtual private cloud (VPC) for external tables. For more information, see Hologres external tables.
True: enabled.
False: disabled.
odps.sql.groupby.position.alias
Specifies whether to process integer constants in the
GROUP BY
clause as column numbers in a SELECT statement.True: enabled.
False: disabled.
odps.sql.groupby.skewindata
Specifies whether to enable the anti-skew feature for the
GROUP BY
clause.True: enabled.
False: disabled.
odps.sql.orderby.position.alias
Specifies whether to process integer constants in the
ORDER BY
clause as column numbers in a SELECT statement.True: enabled.
False: disabled.
odps.sql.groupby.orderby.position.alias
Specifies whether to process integer constants in the
GROUP BY
andORDER BY
clauses as column numbers in a SELECT statement. We recommend that you specify this property.NoteIf you set this property to True for projects in which jobs are running, the job parsing and running may be adversely affected. Therefore, you must make sure that existing jobs can still run properly based on the original logic when you set the property to True. Otherwise, specify this property for sessions.
True: enabled.
False: disabled.
odps.sql.mapjoin.memory.max
The size of small tables that are read to the memory when the MAPJOIN statement is executed. Unit: MB.
NoteThis parameter is also subject to the total memory global limit at the task level. For multiple small tables within the same task, the total memory global upper limit equals
std::min(8G, odps.sql.mapjoin.memory.max * the number of small tables in the same task)
.Valid values: 0 to 8192.
odps.sql.metering.value.max
The upper limit on resources consumed by an SQL statement. For more information, see Consumption control.
-
odps.sql.python.version
The Python version on which SQL statements are executed.
cp27
cp37
odps.sql.select.output.format
Specifies whether to display the table headers in the returned results of the MaxCompute client. For more information, see MaxCompute client (odpscmd).
{""needHeader"":false,""fieldDelim"":""""}
: The table headers are not displayed.{""needHeader"":true,""fieldDelim"":""""}
: The table headers are displayed.
odps.sql.timezone
The time zone of the MaxCompute project that you accessed. For more information about time zones, see Time zone configuration operations.
-
odps.sql.unstructured.data.oss.use.https
Specifies whether to enable the mechanism of obtaining data by using HTTPS at the underlying layer when you access external tables. For more information, see Create an OSS external table.
True: enabled.
False: disabled.
odps.sql.unstructured.oss.commit.mode
Specifies whether to enable the multipart upload feature of Object Storage Service (OSS) to write data to OSS external tables. For more information, see Write data to OSS.
True: enabled.
False: disabled.
odps.sql.decimal.tostring.trimzero
Specifies whether to remove the zero at the end of the decimal point when the data type is DECIMAL. The default value is
True
, which indicates to remove the zero at the end of the decimal point.True: The zero at the end of the decimal point is removed.
False: The zero at the end of the decimal point is retained.
odps.sql.unstructured.tablestore.put.row
Specifies whether to call the PutRow operation to write data to a Tablestore external table.
NoteFor more information about PutRow, see PutRow.
True: enabled.
False: disabled.
odps.sql.unstructured.external.max.dop
The maximum concurrency of workers that are accessed from an external tables.
The value of this property must be a positive integer.
odps.sql.executionengine.enable.string.to.date.full.format
Specifies whether to convert a date string that contains the hour, minute, and second parts.
True: enabled.
False: enabled.
Default value: False.
odps.sql.executionengine.enable.rand.time.seed
Specifies whether the RAND function uses the current system time or the current instance ID as the seed to generate random numbers.
True: The RAND function uses the current system time as the seed to generate random numbers. However, the RAND function no longer ensures idempotence and cannot be used as the shuffle key. As a result, rerun results may be different.
False: The RAND function uses the current instance ID as the seed to generate random numbers. The RAND function can ensure idempotence.
Default value: False.
odps.sql.type.json.enable
Specifies whether to enable JSON data types. For more information about JSON data types, see Instructions for using JSON data types in MaxCompute.
True: enabled.
False: disabled.
odps.ext.oss.orc.native
Specifies whether to use the native ORC reader to read tables.
True: The native ORC reader is used.
False: The native ORC reader is not used.
odps.sql.job.max.time.hours
The maximum duration in which a single job runs.
NoteIf you need to configure the property at the project level, you can click the link or search for the DingTalk group (ID: 11782920) to contact MaxCompute technical support.
Valid values: 1 to 72. Unit: hours.
Default value: 24.
odps.task.wlm.quota
The name of the quota that is used to run tasks in the current session. For more information about quotas, see Manage quotas in the new MaxCompute console.
-
NoteIf you need to query the MaxCompute system variables that are configured for the current session, you can view them in the SourceXML of LogView. For more information, see Use LogView 2.0 to view job running information.
Examples
-- Set the size of data read by each Mapper to 256 MB.
SET odps.stage.mapper.split.size=256;
Related statements
SHOW: This statement is used to display the properties that you configured by using the SET statement.