Complex type functions - MaxCompute - Alibaba Cloud Documentation Center

You can use complex type functions in MaxCompute SQL to process data of complex data types, such as ARRAY, MAP, STRUCT, and JSON. This topic describes the syntax and parameters of complex type functions that are supported by MaxCompute SQL, and provides examples on how to use complex type functions. This topic guides you through data development by using complex type functions.

The following table describes the complex type functions that are supported by MaxCompute SQL. For more information about the limits on JSON functions, see Limits.

Function type	Function	Description
ARRAY functions	ALL_MATCH	Checks whether all elements in an array meet a specific condition.
	ANY_MATCH	Checks whether an element in an array meets a specific condition.
	ARRAY	Creates an array based on given values.
	ARRAY_CONTAINS	Checks whether an array contains a given value.
	ARRAY_DISTINCT	Removes duplicate elements from an array.
	ARRAY_EXCEPT	Finds the elements that exist in Array A but do not exist in Array B and returns the elements as a new array without duplicates.
	ARRAY_INTERSECT	Calculates the intersection of two arrays.
	ARRAY_JOIN	Concatenates the elements in an array by using a delimiter.
	ARRAY_MAX	Returns the largest element in an array.
	ARRAY_MIN	Returns the smallest element in an array.
	ARRAY_NORMALIZE	Returns an array whose elements are normalized based on the specified p norm.
	ARRAY_POSITION	Returns the position of the first occurrence of a given element in an array.
	ARRAY_REDUCE	Aggregates the elements in an array.
	ARRAY_REMOVE	Removes a given element from an array.
	ARRAY_REPEAT	Returns a new array in which a given element is repeated several times.
	ARRAY_SORT	Sorts the elements in an array based on a comparator.
	ARRAY_UNION	Calculates the union of two arrays and returns the union as a new array without duplicates.
	ARRAYS_OVERLAP	Checks whether two arrays contain the same element.
	ARRAYS_ZIP	Merges multiple arrays.
	COMBINATIONS	Returns an array of n-element subsets based on the elements in an input array.
	CONCAT	Concatenates multiple arrays or strings.
	EXPLODE	Transposes one row of data into multiple rows. This function is a user-defined table-valued function (UDTF).
	FILTER	Filters the elements in an array.
	FLATTEN	Converts multiple arrays of the ARRAY data type into a single array.
	INDEX	Returns the element at a specific position in an array.
	NGRAMS	Returns an n-gram array of the elements of the specified array.
	POSEXPLODE	Converts an array into a table that has two columns. The first column lists the position of each element in the array, starting from 0. The second column lists the elements.
	REVERSE	Returns an array in reverse order of the elements of a specific array.
	SEQUENCE	Returns an array that contains the specified elements based on expressions.
	SHUFFLE	Returns elements of an array in random order.
	SIZE	Returns the number of elements in an array.
	SLICE	Copies the elements in an array from a specific position based on a specific length and returns the elements as a new array.
	SORT_ARRAY	Sorts the elements in an array.
	SPLIT	Splits a string with a specified delimiter and returns an array.
	TRANSFORM	Transforms the elements in an array.
	ZIP_WITH	Merges two arrays at the element level based on element positions and returns a new array.
MAP functions	EXPLODE	Transposes one row of data into multiple rows. This function is a UDTF.
	INDEX	Returns the value that meets a specific condition in a map.
	MAP	Creates a map based on given key-value pairs.
	MAP_CONCAT	Returns the union of multiple maps.
	MAP_ENTRIES	Converts key-value pairs in a map into a struct array.
	MAP_FILTER	Filters the elements in a map.
	MAP_FROM_ARRAYS	Creates a map based on given arrays.
	MAP_FROM_ENTRIES	Creates a map based on given struct arrays.
	MAP_KEYS	Returns all keys in a map as an array.
	MAP_VALUES	Returns all values in a map as an array.
	MAP_ZIP_WITH	Merges two given maps into a single map.
	MULTIMAP_FROM_ENTRIES	Returns a map that consists of the keys in a struct array and an array containing all values.
	SIZE	Returns the number of key-value pairs in a map.
	TRANSFORM_KEYS	Transforms the keys in a map by using a given function. The values in the map are not changed.
	TRANSFORM_VALUES	Transforms the values in a map by using a given function. The keys in the map are not changed.
STRUCT functions	FIELD	Obtains the value of a member variable in a struct.
	INLINE	Expands a given struct array. Each array element corresponds to a row and each struct element corresponds to a column in each row.
	NAMED_STRUCT	Creates a struct based on given name-value pairs.
	STRUCT	Creates a struct based on a given value list.
JSON functions	FROM_JSON	Returns data of the ARRAY, MAP, or STRUCT type based on a given JSON string and a given output format.
	GET_JSON_OBJECT	Extracts a single string from a standard JSON string by using a specific method.
	JSON_TUPLE	Extracts strings from a standard JSON string based on a set of input keys.
	TO_JSON	Converts data of a complex data type into a JSON string.
	JSON_OBJECT	Returns a JSON object that contains key-value pairs.
	JSON_ARRAY	Evaluates a possibly empty list of values and returns a JSON array that contains these values. Evaluates a possibly empty list of values and returns a JSON array that contains these values.
	JSON_EXPLODE	Expands each element in a JSON array or JSON object into multiple output rows.
	JSON_EXTRACT	Parses the value of json_path in a JSON expression. Note that an error is returned if the value of json_path is invalid.
	JSON_EXISTS	Determines whether the JSON value of json_path exists.
	JSON_PRETTY	Returns a JSON value in a format that is easy to read by adding line breaks and spaces.
	JSON_TYPE	Returns the data type of a JSON value.
	JSON_FORMAT	Converts a value of the JSON data type into a value of the STRING data type. By default, JSON data is not automatically prettified.
	JSON_PARSE	Converts a value of the STRING data type into a value of the JSON data type. If a non-JSON-formatted value is converted into a value of the STRING data type, an error is returned.
	JSON_VALID	Determines whether a string is in a valid JSON format.
	CAST	Supports conversion between basic data types and JSON data types.

Note

In the following examples, the combination of a hyphen and a closing angle bracket (->) is used. For more information about how to use the combination of a hyphen and a closing angle bracket (->) in Lambda functions, see Lambda functions.

ALL_MATCH

Syntax

boolean all_match(array<T> <a>, function<T, boolean> <predicate>)

Description
Checks whether all elements in Array a meet the predicate condition.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- predicate: required. This parameter specifies the built-in function, user-defined function, or expression that is used to determine whether all elements in Array a meet requirements. The data type of the input parameter must be the same as the data type of the elements in Array a.
Return value
A value of the BOOLEAN type is returned. The return value varies based on the following rules:
- If all elements in Array a meet the predicate condition or the array is empty, True is returned.
- If one or more elements in Array a do not meet the predicate condition, false is returned.
- If an element in Array a is null and other elements meet the predicate condition, null is returned.
Examples
- Example 1: Check whether all elements in array(4, 5, 6) meet the x x > 3 condition. This condition specifies that all elements are greater than 3. Sample statement:
```
-- The return value is true. 
select all_match(array(4, 5, 6), x -> x>3);
```
- Example 2: The array is empty. Sample statement:
```
-- The return value is true. 
select all_match(array(), x -> x>3);
```
- Example 3: Check whether all elements in array(1, 2, -10, 100, -30) meet the x-> x > 3 condition. Sample statement:
```
-- The return value is false. 
select all_match(array(1, 2, -10, 100, -30), x -> x>3);
```
- Example 4: Check whether all elements in array(10, 100, 30, null) meet the x-> x > 3 condition. Sample statement:
```
-- The return value is null. 
select all_match(array(10, 100, 30, null), x -> x>3);
```

ANY_MATCH

Syntax

boolean any_match(array<T> <a>, function<T, boolean> <predicate>)

Description
Checks whether an element in Array a meets the predicate condition.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- predicate: required. This parameter specifies the built-in function, user-defined function, or expression that is used to determine whether all elements in Array a meet requirements. The data type of the input parameter must be the same as the data type of the elements in Array a.
Return value
A value of the BOOLEAN type is returned. The return value varies based on the following rules:
- If one or more elements in Array a meet the predicate condition, true is returned.
- If no elements in Array a meet the predicate condition or the array is empty, false is returned.
- If an element in Array a is null and other elements do not meet the predicate condition, null is returned.
Examples
- Example 1: Check whether an element in array(1, 2, -10, 100, -30) meets the x-> x > 3 condition. Sample statement:
```
-- The return value is true. 
select any_match(array(1, 2, -10, 100, -30), x-> x > 3);
```
- Example 2: The array is empty. Sample statement:
```
-- The return value is false. 
select any_match(array(), x-> x > 3);
```
- Example 3: Check whether an element in array(1, 2, -10, -20, -30) meets the x-> x > 3 condition. Sample statement:
```
-- The return value is false. 
select any_match(array(1, 2, -10, -20, -30), x-> x > 3);
```
- Example 4: Check whether an element in array(1, 2, null, -10) meets the x-> x > 3 condition. Sample statement:
```
-- The return value is null. 
select any_match(array(1, 2, null, -10), x-> x > 3);
```

ARRAY

Syntax
```
array array(<value>,<value>[, ...])
```
Description
Creates an array based on given values.
Parameters
value: required. A value of any data type. All values must be of the same data type.
Return value
A value of the ARRAY type is returned.

Examples

Create an array based on the data in the t_table table that contains the c1 (BIGINT), c2 (STRING), c3 (STRING), c4 (BIGINT), and c5 (BIGINT) columns. Data in the table:

+------------+----+----+------------+------------+
| c1         | c2 | c3 | c4         | c5         |
+------------+----+----+------------+------------+
| 1000       | k11 | k21 | 86         | 15         |
| 1001       | k12 | k22 | 97         | 2          |
| 1002       | k13 | k23 | 99         | 1          |
+------------+----+----+------------+------------+

Sample statement:

-- Create an array based on the data in the c2, c4, c3, and c5 columns. 
select array(c2,c4,c3,c5) from t_table;

-- The following result is returned: 
+------+
| _c0  |
+------+
| [k11, 86, k21, 15] |
| [k12, 97, k22, 2] |
| [k13, 99, k23, 1] |
+------+

ARRAY_CONTAINS

Syntax

boolean array_contains(array<T> <a>, value <v>)

Description
Checks whether Array a contains Element v.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- v: required. This parameter specifies the element that you want to check. The value of v must be of the same data type as the elements in Array a.
Return value
A value of the BOOLEAN type is returned.

Examples

The t_table_array table contains the c1 (BIGINT) and t_array (ARRAY<STRING>) columns. Data in the table:

+------------+---------+
| c1         | t_array |
+------------+---------+
| 1000       | [k11, 86, k21, 15] |
| 1001       | [k12, 97, k22, 2] |
| 1002       | [k13, 99, k23, 1] |
+------------+---------+

Sample statement:

-- Check whether the t_array column contains the value 1. 
select c1, array_contains(t_array,'1') from t_table_array;
-- The following result is returned: 
+------------+------+
| c1         | _c1  |
+------------+------+
| 1000       | false |
| 1001       | false |
| 1002       | true |
+------------+------+

ARRAY_DISTINCT

Syntax
```
array<T> array_distinct(array<T> <a>)
```
Description
Removes duplicate elements from Array a.
Parameters
a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- The returned array has no duplicate elements, and the elements are sorted in the same order as Array a.
- If an element in Array a is null, the null element is involved in the operation.
- If the input array is empty, an empty array is returned.
Examples
- Example 1: Remove duplicate elements from array(10, 20, 30, 30, 20, 10). Sample statement:
```
-- The return value is [10,20,30]. 
select array_distinct(array(10, 20, 30, 30, 20, 10));
```
- Example 2: Remove duplicate elements from array(10, 20, 20, null, null, 30, 20, null). Sample statement:
```
-- The return value is [10,20,null,30]. 
select array_distinct(array(10, 20, 20, null, null, 30, 20, null)); 
```
- Example 3: Remove duplicate elements from an empty array. Sample statement:
```
-- The return value is []. 
select array_distinct(array());
```

ARRAY_EXCEPT

Syntax

array<T> array_except(array<T> <a>, array<T> <b>)

Description
Finds the elements that exist in Array a but do not exist in Array b and returns the elements as a new array without duplicates.
Parameters
a and b: required. These parameters specify arrays. T in array<T> specifies the data type of the elements in the arrays. The elements can be of any data type. The elements in Array a and the elements in Array b must be of the same data type.
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- The returned array has no duplicate elements, and the elements are sorted in the same order as Array a.
- If an element in an array is null, the null element is involved in the operation.
- If one of the input arrays is empty, a new array is returned based on the non-empty array without duplicates.
- If the two input arrays are empty, an empty array is returned.
Examples
- Example 1: Find the elements that exist in array(1, 1, 3, 3, 5, 5) but do not exist in array(1, 1, 2, 2, 3, 3) and return the elements as a new array without duplicates. Sample statement:
```
-- The return value is [5]. 
select array_except(array(1, 1, 3, 3, 5, 5), array(1, 1, 2, 2, 3, 3));
```
- Example 2: Find the elements that exist in array(1, 1, 3, 3, 5, 5, null, null) but do not exist in array(1, 1, 2, 2, 3, 3) and return the elements as a new array without duplicates. Sample statement:
```
-- The return value is [5,null]. 
select array_except(array(1, 1, 3, 3, 5, 5, null, null), array(1, 1, 2, 2, 3, 3));
```
- Example 3: Find the elements that exist in an array but do not exist in another array and return the elements as a new array without duplicates. One of the input arrays is empty. Sample statement:
```
-- The return value is [2,1]. 
select array_except(array(2, 1, 1, 2), cast(array() as array<int>)); 
```
- Example 4: Find the elements that exist in an array but do not exist in another array and return the elements as a new array without duplicates. The two input arrays are empty. Sample statement:
```
-- The return value is []. 
select array_except(cast(array() as array<int>), cast(array() as array<int>));
```

ARRAY_INTERSECT

Syntax

array<T> array_intersect(array<T> <a>, array<T> <b>)

Description
Calculates the intersection of Array a and Array b and removes duplicate elements.
Parameters
a and b: required. These parameters specify arrays. T in array<T> specifies the data type of the elements in the arrays. The elements can be of any data type. The elements in Array a and the elements in Array b must be of the same data type.
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- If an element in an array is null, the null element is involved in the operation.
- The returned array has no duplicate elements, and the elements are sorted in the same order as Array a.
- If Array a or Array b is null, null is returned.
Examples
- Example 1: Calculate the intersection of array(1, 2, 3) and array(1, 3, 5), and remove duplicate elements. Sample statement:
```
-- The return value is [1,3]. 
select array_intersect(array(1, 2, 3), array(1, 3, 5));
```
- Example 2: Calculate the intersection of array(10, 20, 20, 30, 30, null, null) and array(30, 30, 20, 20, 40, null, null), and remove duplicate elements. Sample statement:
```
-- The return value is [20,30,null]. 
select array_intersect(array(10, 20, 20, 30, 30, null, null), array(30, 30, 20, 20, 40, null, null)); 
```

ARRAY_JOIN

Syntax

array_join(array<T> <a>, <delimiter>[, <nullreplacement>])

Description
Concatenates the elements in Array a by using a delimiter. If the array contains a null element, use nullreplacement to specify the string that you want to use to replace the null element in the results. If you do not configure nullreplacement, the null element is ignored.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array.
  Note
  If the elements in the array are not of the STRING type, MaxCompute converts the data type of the elements into STRING.
- delimiter: required. A value of the STRING type. This parameter specifies the string that is used to separate the concatenated elements in Array a.
- nullreplacement: optional. This parameter specifies the string that is used to replace null elements.
Return value
A value of the STRING type is returned.

Examples

-- The return value is 10,20,20,30. 
select array_join(array(10, 20, 20, null, null, 30), ",");
-- The return value is 10##20##20##null##null##30. 
select array_join(array(10, 20, 20, null, null, 30), "##", "null");

ARRAY_MAX

Syntax
```
T array_max(array<T> <a>) 
```
Description
Returns the largest element in Array a.
Parameters
a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array.
The following data types are supported:
- TINYINT, SMALLINT, INT, and BIGINT
- FLOAT and DOUBLE
- BOOLEAN
- DECIMAL and DECIMALVAL
- DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
- STRING, BINARY, VARCHAR, and CHAR
- ARRAY, STRUCT, and MAP
Return value
The largest element in Array a is returned. The return value varies based on the following rules:
- If Array a is null, null is returned.
- If an element in Array a is null, the null element is not involved in the operation.

Examples

-- The return value is 20. 
select array_max(array(1, 20, null, 3));

ARRAY_MIN

Syntax
```
T array_min(array<T> <a>) 
```
Description
Returns the smallest element in Array a.
Parameters
a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array.
The following data types are supported:
- TINYINT, SMALLINT, INT, and BIGINT
- FLOAT and DOUBLE
- BOOLEAN
- DECIMAL and DECIMALVAL
- DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
- STRING, BINARY, VARCHAR, and CHAR
- ARRAY, STRUCT, and MAP
Return value
The smallest element in Array a is returned. The return value varies based on the following rules:
- If Array a is null, null is returned.
- If an element in Array a is null, the null element is not involved in the operation.

Examples

-- The return value is 1. 
select array_min(array(1, 20, null, 3));

ARRAY_NORMALIZE

Syntax
```
array_normalize(array, p)
```
Description
Returns an array whose elements are normalized based on the specified p norm.
This function is equivalent to TRANSFORM(array, v -> v / REDUCE(array, 0, (a, v) -> a + POW(ABS(v), p), a -> POW(a, 1 / p)). However, the REDUCE function is called only once.
Parameters
array: an input array. The elements in the array can be of only the FLOAT and DOUBLE data types.
p: the p norm of the array.
Return value
An array whose elements are normalized based on the specified p norm is returned.
- If the array is null or contains null elements, null is returned.
- If the p=0 condition is met, the original array is returned. If the p<0 condition is met, an error is returned.

Examples

SELECT  array_normalize(array(10.0, 20.0, 50.0), 1.0);

The following result is returned:

[0.125, 0.25, 0.625]

ARRAY_POSITION

Syntax

bigint array_position(array<T> <a>, T <element>)

Description
Returns the position of the first occurrence of a given element in Array a. The position numbers of elements are counted from left to right and start from 1.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The following data types are supported:
  - TINYINT, SMALLINT, INT, and BIGINT
  - FLOAT and DOUBLE
  - BOOLEAN
  - DECIMAL and DECIMALVAL
  - DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
  - STRING, BINARY, VARCHAR, and CHAR
  - ARRAY, STRUCT, and MAP
- element: required. The element whose position you want to query. The data type of this parameter must be the same as the data type of the elements in Array a.
Return value
A value of the BIGINT type is returned. The return value varies based on the following rules:
- If Array a or element is null, null is returned.
- If the specific element is not found, 0 is returned.
Examples
- Example 1: Return the position of the first occurrence of 1 in array(3, 2, 1). Sample statement:
```
-- The return value is 3. 
select array_position(array(3, 2, 1), 1);
```
- Example 2: The specified element is null. Sample statement:
```
-- The return value is null. 
select array_position(array(3, 1, null), null);
```

ARRAY_REDUCE

Syntax

R array_reduce(array<T> <a>, buf <init>, function<buf, T, buf> <merge>, function<buf, R> <final>)

Description
Aggregates the elements in Array a.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- init: required. The initial value of the intermediate result that is used to aggregate elements.
- merge: required. A built-in function, user-defined function, or expression that is used to perform an operation on each element in Array a and the intermediate result. This function or expression uses the elements of Array a and the init parameter as input parameters.
- final: required. A built-in function, user-defined function, or expression that is used to convert the intermediate result into the final result. This function or expression uses the result of merge as the input parameter. R specifies the data type of the output.
Return value
The data type of the return value is the same as the data type that is specified for final.

Examples

-- The return value is 6. 
select array_reduce(array(1, 2, 3), 0, (buf, e)->buf + e, buf->buf);
-- The return value is 2.5. 
select array_reduce(array(1, 2, 3, 4), named_struct('sum', 0, 'count', 0), (buf, e)->named_struct('sum', buf.sum + e, 'count', buf.count + 1), buf -> buf.sum / buf.count);

ARRAY_REMOVE

Syntax

array<T> array_remove(array<T> <a>, T <element>)

Description
Removes a given element from Array a.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The following data types are supported:
  - TINYINT, SMALLINT, INT, and BIGINT
  - FLOAT and DOUBLE
  - BOOLEAN
  - DECIMAL and DECIMALVAL
  - DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
  - STRING, BINARY, VARCHAR, and CHAR
  - ARRAY, STRUCT, and MAP
- element: required. The element that you want to remove. The data type of this parameter must be the same as the data type of the elements in Array a.
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- If an element in Array a is null, the null element is not involved in the operation.
- If Array a or element is null, null is returned.
- If Array a does not contain the specific element, Array a is returned.

Examples

Example 1: Remove 1 from array(3, 2, 1). Sample statement:

-- The return value is [3,2]. 
select array_remove(array(3, 2, 1), 1);

Example 2: The specified element is null. Sample statement:

-- The return value is null. 
select array_remove(array(3, 1, null), null);

Example 3: Remove 2 from array(3, 1, null). Sample statement:

-- The return value is [3,1,null]. 
select array_remove(array(3, 1, null), 2);

ARRAY_REPEAT

Syntax

array<T> array_repeat(T <element>, int <count>)

Description
Returns a new array in which element t is repeated count times.
Parameters
- t: required. This parameter specifies the element that you want to repeat. The following data types are supported:
  - TINYINT, SMALLINT, INT, and BIGINT
  - FLOAT and DOUBLE
  - BOOLEAN
  - DECIMAL and DECIMALVAL
  - DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
  - STRING, BINARY, VARCHAR, and CHAR
  - ARRAY, STRUCT, and MAP
- count: required. This parameter specifies the number of repetitions. A value of the INT type is required. The value must be greater than or equal to 0.
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- If the value of count is null, null is returned.
- If the value of count is less than 0, an empty array is returned.
Examples
- Example 1: Repeat 123 twice and return the new array. Sample statement:
```
-- The return value is [123, 123]. 
select array_repeat('123', 2);
```
- Example 2: The value of count is null. Sample statement:
```
-- The return value is null. 
select array_repeat('123', null);
```
- Example 3: The value of count is less than 0. Sample statement:
```
-- The return value is []. 
select array_repeat('123', -1);
```

ARRAY_SORT

Syntax

array<T> array_sort(array<T> <a>, function<T, T, bigint> <comparator>)

Description
Sorts the elements in Array a based on a comparator.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- comparator: required. A built-in function, user-defined function (UDF), or expression that is used to compare two elements in the array.
  Processing logic of comparator(a, b): If a is equal to b, 0 is returned. If a is less than b, a negative integer is returned. If a is greater than b, a positive integer is returned. If comparator(a, b) returns null, an error is returned.
  Important
  Comparison functions in ARRAY_SORT must be self-consistent. Examples:
  If compare(a, b) > 0 is met, compare(b, a) < 0 must be met.
  If compare(a, b) = 0 is met, compare(b, a) = 0 must be met.
  If compare(a, b) < 0 is met, compare(b, a) > 0 must be met.
  Examples of functions that are not self-consistent:
  (left, right) -> CASE WHEN left <= right THEN -1L ELSE 0L END:
  If you configure a = 1 and b = 1, the compare(a, b) function returns -1, and the compare(b, a) function also returns -1. The comparison results conflict with each other, and the functions are not self-consistent.
  (left, right) -> CASE WHEN left < right THEN - 1L WHEN left = right THEN 0L ELSE 1L END:
  If you configure a = NULL and b = 1, the compare(a, b) function returns 1 and the compare(b, a) function also returns 1. The comparison results conflict with each other, and the functions are not self-consistent.
Return value
A value of the ARRAY type is returned.

Examples

Example 1: Sort the elements in array(5,6,1).

SELECT array_sort(array(5,6,1), (left,right) -> CASE WHEN left < right THEN -1L WHEN left > right THEN 1L ELSE 0L END);

-- The following result is returned:
+------------+
| _c0        |
+------------+
| [1,5,6]    |
+------------+

Example 2:

SELECT array_sort(a, (a,b)-> CASE WHEN a.a>b.a THEN 1L WHEN a.a<b.a THEN -1L ELSE 0L END) 
FROM VALUES ( 
  array(named_struct('a', 1, 'b', 10),
        named_struct('a', 3, 'b', 11),
        named_struct('a', 2, 'b', 12)))
  AS t(a);
  
-- The following result is returned:
+------+
| _c0 |
+------+
| [{a:1, b:10}, {a:2, b:12}, {a:3, b:11}] |
+------+

ARRAY_UNION

Syntax

array<T> array_union(array<T> <a>,  array<T> <b>)

Description
Calculates the union of Array a and Array b, and removes duplicate elements.
Parameters
a and b: required. These parameters specify arrays. T in array<T> specifies the data type of the elements in the arrays. The elements can be of any data type. The elements in Array a and the elements in Array b must be of the same data type.
The following data types are supported:
- TINYINT, SMALLINT, INT, and BIGINT
- FLOAT and DOUBLE
- BOOLEAN
- DECIMAL and DECIMALVAL
- DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
- STRING, BINARY, VARCHAR, and CHAR
- ARRAY, STRUCT, and MAP
Return value
A value of the ARRAY type is returned. If Array a or Array b is null, null is returned.
Examples
- Example 1: Calculate the union of array(1, 2, 3) and array(1, 3, 5), and remove duplicate elements. Sample statement:
```
-- The return value is [1,2,3,5]. 
select array_union(array(1, 2, 3), array(1, 3, 5));
```
- Example 2: Calculate the union of two arrays and remove duplicate elements. One of the arrays is null. Sample statement:
```
-- The return value is null. 
select array_union(array(1, 2, 3), null);
```

ARRAYS_OVERLAP

Syntax

boolean arrays_overlap(array<T> <a>,  array<T> <b>)

Description
Checks whether Array a and Array b contain the same element.
Parameters
a and b: required. These parameters specify arrays. T in array<T> specifies the data type of the elements in the arrays. The elements can be of any data type. The elements in Array a and the elements in Array b must be of the same data type.
The following data types are supported:
- TINYINT, SMALLINT, INT, and BIGINT
- FLOAT and DOUBLE
- BOOLEAN
- DECIMAL and DECIMALVAL
- DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
- STRING, BINARY, VARCHAR, and CHAR
- ARRAY, STRUCT, and MAP
Return value
A value of the BOOLEAN type is returned. The return value varies based on the following rules:
- If Array a contains at least one element that is in Array b and is not null, true is returned.
- If Array a and Array b do not contain the same element, both of the arrays are not empty, and one or both of the arrays contain a null element, null is returned.
- If Array a and Array b do not contain the same element, and both of the arrays are not empty and do not contain a null element, false is returned.
Examples
- Example 1: Check whether array(1, 2, 3) and array(3, 4, 5) contain the same element. Sample statement:
```
-- The return value is true. 
select arrays_overlap(array(1, 2, 3), array(3, 4, 5));
```
- Example 2: Check whether array(1, 2, 3) and array(6, 4, 5) contain the same element. Sample statement:
```
-- The return value is false. 
select arrays_overlap(array(1, 2, 3), array(6, 4, 5));
```
- Example 3: Check whether two arrays contain the same element. One of the arrays contains the null element. Sample statement:
```
-- The return value is null. 
select arrays_overlap(array(1, 2, 3), array(5, 4, null));
```

ARRAYS_ZIP

Syntax

array<struct<T, U, ...>> arrays_zip(array<T> <a>, array<U> <b>[, ...])

Description
Merges multiple given arrays and returns a struct array, in which the Nth struct contains all the Nth elements of the input arrays.
Parameters
a and b: required. These parameters specify arrays. T in array<T> and U in array<U> specify the data types of the elements in the arrays. The elements can be of any data type.
The following data types are supported:
- TINYINT, SMALLINT, INT, and BIGINT
- FLOAT and DOUBLE
- BOOLEAN
- DECIMAL and DECIMALVAL
- DATE, DATETIME, TIMESTAMP, IntervalDayTime, and IntervalYearMonth
- STRING, BINARY, VARCHAR, and CHAR
- ARRAY, STRUCT, and MAP
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- The Nth struct in the generated struct array contains all the Nth elements of the input arrays. If an array contains less than N elements, null is used as the Nth element of the array.
- If one or more input arrays are null, null is returned.

Examples

Example 1: Merge array(1, 2, 3) and array(2, 3, 4) into a struct array. Sample statement:

-- The return value is [{0:1, 1:2}, {0:2, 1:3}, {0:3, 1:4}]. 
select arrays_zip(array(1, 2, 3), array(2, 3, 4));

Example 2: Merge array(1, 2, 3) and array(4, 5) into a struct array. Sample statement:

-- The return value is [{0:1, 1:4}, {0:2, 1:5}, {0:3, 1:null}]. 
select arrays_zip(array(1, 2, 3), array(4, 5));

COMBINATIONS

Syntax
```
combinations(array(T), n)
```
Description
Returns an array of n-element subsets based on the elements in an input array.
Parameters
array: an input array.
n: the number of elements in each subset.
Return value
Returns an array of n-element subsets based on the elements in an input array.
- If the input array has no duplicate elements, an array of n-element subsets is returned. The subsets are deterministic and listed in random order. Elements in a subset are deterministic and listed in random order.
- By default, the maximum value of n is 5. You can specify the odps.sql.max.combination.length parameter to change the maximum value. By default, the total number of generated subsets cannot exceed 100,000. You can specify the odps.sql.max.combinations parameter to change the total number.
- If the value of n is greater than the number of elements in the input array, an empty array is returned.

Examples

SELECT combinations(array('foo', 'bar', 'boo'),2);

The following result is returned:

[['foo', 'bar'], ['foo', 'boo']['bar', 'boo']]

SELECT combinations(array(1,2,3,4,5),3);

The following result is returned:

[[1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4], [1, 2, 5], [1, 3, 5], [2, 3, 5], [1, 4, 5], [2, 4, 5], [3, 4, 5]]

SELECT combinations(array(1,2,2),2);

The following result is returned:

[[1,2], [1,2], [2,2]]

CONCAT

Syntax

array<T> concat(array<T> <a>, array<T> <b>[,...])
string concat(string <str1>, string <str2>[,...])

Description
- Arrays as inputs: Concatenates all elements of multiple arrays and returns a new array.
- Strings as inputs: Concatenates multiple strings and returns a new string.
Parameters
- a and b: required. These parameters specify arrays. T in array<T> specifies the data type of the elements in the arrays. The elements can be of any data type. The elements in Array a and the elements in Array b must be of the same data type. The null elements are also involved in the operation.
- str1 and str2: required. Values of the STRING type. If the input values are of the BIGINT, DOUBLE, DECIMAL, or DATETIME type, they are implicitly converted into values of the STRING type before calculation. If the input values are of other data types, an error is returned.
Return value
- A value of the ARRAY type is returned. If one of the input arrays is null, null is returned.
- A value of the STRING type is returned. If no input parameters are configured or an input parameter is set to null, null is returned.

Examples

Example 1: Concatenate all elements of array(10, 20) and array(20, -20). Sample statement:

-- The return value is [10, 20, 20, -20]. 
select concat(array(10, 20), array(20, -20));

Example 2: One of the input arrays contains a null element. Sample statement:

-- The return value is [10, null, 20, -20]. 
select concat(array(10, null), array(20, -20));

Example 3: One of the input arrays is null. Sample statement:

-- The return value is null. 
select concat(array(10, 20), null);

Example 4: Concatenate strings aabc and abcde. Sample statement:

-- The return value is aabcabcde. 
select concat('aabc','abcde');

Example 5: The input is empty. Sample statement:
```
-- The return value is null. 
select concat();
```

Example 6: One of the input strings is null. Sample statement:

-- The return value is null. 
select concat('aabc', 'abcde', null);

EXPLODE

Limits
- A SELECT statement can include only one EXPLODE function. Only data in a column can be calculated by using the EXPLODE function.
- This function cannot be used with the GROUP BY, CLUSTER BY, DISTRIBUTE BY, or SORT BY clause.
Syntax
```
explode (<var>)
```
Description
Transposes one row of data into multiple rows. This function is a UDTF.
- If the parameter value is of the array<T> type, the array stored in the column is transposed into multiple rows.
- If the parameter value is of the map<K, V> type, each key-value pair of the map stored in the column is transposed into one row with two columns. One column is used to store keys, and the other column is used to store values.
Parameters
var: required. The value must be of the array<T> or map<K, V> type.
Return value
Rows after transposition are returned.

Examples

The t_table_map table contains the c1 (BIGINT) and t_map (MAP<STRING,BIGINT>) columns. Data in the table:

  +------------+-------+
| c1         | t_map |
+------------+-------+
| 1000       | {k11:86, k21:15} |
| 1001       | {k12:97, k22:2} |
| 1002       | {k13:99, k23:1} |
+------------+-------+

Sample statement:

select explode(t_map) from t_table_map;
-- The following result is returned: 
+-----+------------+
| key | value      |
+-----+------------+
| k11 | 86         |
| k21 | 15         |
| k12 | 97         |
| k22 | 2          |
| k13 | 99         |
| k23 | 1          |
+-----+------------+

FIELD

Syntax
```
T field(struct <s>, string <fieldName>)
```
Description
Obtains the value of a member variable in a struct.
Parameters
- s: required. This parameter specifies a struct. The struct is in the format of {f1:T1, f2:T2[, ...]}. f1 and f2 specify member variables, T1 specifies the value of f1, and T2 specifies the value of f2.
- fieldName: required. A value of the STRING type. This parameter specifies the member variable in the struct.
Return value
A value of the specific member variable in the struct is returned.

Examples

-- The return value is hello. 
select field(named_struct('f1', 'hello', 'f2', 3), 'f1');

FILTER

Syntax

array<T> filter(array<T> <a>, function<T,boolean> <func>)

Description
Filters the elements in Array a by using func and returns a new array.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- func: required. This parameter specifies the built-in function, user-defined function, or expression that is used to filter the elements in Array a. The value must be of the same data type as the elements in Array a. The output result of the function or expression is of the BOOLEAN type.
Return value
A value of the ARRAY type is returned.

Examples

-- The return value is [2, 3]. 
select filter(array(1, 2, 3), x -> x > 1);

FLATTEN

Syntax
```
flatten(arrayOfArray)
```
Description
Converts multiple arrays of the ARRAY data type into a single array.
Parameters
arrayOfArray: arrays of the ARRAY data type.
Return value
Multiple arrays of the ARRAY data type are converted into a single array in the order of the elements.
- If the input value is null, null is returned.
- If the input parameters are not arrays of the ARRAY data type, an error is returned.

Examples

SELECT flatten(array(array(1, 2), array(3, 4)));

The following result is returned:

[1,2,3,4]

FROM_JSON

Syntax
```
from_json(<jsonStr>, <schema>)
```
Description
Returns data of the ARRAY, MAP, or STRUCT type based on JSON string jsonStr and output format schema.

Parameters

jsonStr: required. The JSON string that you entered.

schema: required. The schema of the JSON string. The value of this parameter must be in the same format as that in the statement for creating a table, such as array<bigint>, map<string, array<string>>, or struct<a:int, b:double, `C`:map<string,string>>.

Note

Keys in a struct are case-sensitive. You can also specify a struct in the format of a BIGINT, b DOUBLE, which is equivalent to STRUCT<a:BIGINT, b:DOUBLE>.

The following table describes the mappings between JSON data types and MaxCompute data types.

JSON data type	MaxCompute data type
OBJECT	STRUCT, MAP, and STRING
ARRAY	ARRAY and STRING
NUMBER	TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, and STRING
BOOLEAN	BOOLEAN and STRING
STRING	STRING, CHAR, VARCHAR, BINARY, DATE, and DATETIME
NULL	All types

Note

The JSON string of the OBJECT and ARRAY types are parsed as much as possible. If the data type of the JSON string is not mapped to any MaxCompute data type, the JSON string is omitted. For ease of use, all JSON data types can be converted into the STRING data type supported by MaxCompute. When you convert a JSON string of the NUMBER type to a value of the FLOAT, DOUBLE, or DECIMAL type, the precision of the value cannot be ensured. We recommend you convert the JSON string to a value of the STRING type and then convert the obtained value to a value of the FLOAT, DOUBLE, or DECIMAL type.

Return value
A value of the ARRAY, MAP, or STRUCT type is returned.

Examples

Example 1: Convert a specific JSON string into a value of a specific data type. Sample statement:

-- The return value is {"a":1,"b":0.8}. 
select from_json('{"a":1, "b":0.8}', 'a int, b double');
-- The return value is {"time":"26/08/2015"}. 
select from_json('{"time":"26/08/2015"}', 'time string');
-- The return value is {"a":1,"b":0.8}. 
select from_json('{"a":1, "b":0.8}', 'a int, b double, c string');
-- The return value is [1,2,3]. 
select from_json('[1, 2, 3, "a"]', 'array<bigint>');
-- The return value is {"d":"v","a":"1","b":"[1,2,3]","c":"{}"}. 
select from_json('{"a":1,"b":[1,2,3],"c":{},"d":"v"}', 'map<string, string>');

Example 2: Use the map_keys and from_json functions to obtain all keys in a JSON string. You can also use JSON_KEYS for the same purpose. Sample statement:
```
-- The return value is ["a","b"]. 
select map_keys(from_json('{"a":1,"b":2}','map<string,string>'));
```

GET_JSON_OBJECT

Usage notes

This function is used to extract a single string from a standard JSON string by using JSONPath. This function supports input parameters of the following data types:

JSON types: If input parameters are of JSON types, standard JSONPath expressions are used.
STRING type: If input parameters are of the STRING type, the original JSONPath expressions are used.

The methods and usage notes for using this function vary based on input parameter data types. This topic describes how to use the GET_JSON_OBJECT function when input parameters are of the JSON and STRING types.

Note

The JSONPath expressions that are used when input parameters are of JSON types and those that are used when input parameters are of the STRING type comply with different rules. This may cause incompatibility issues.
The GET_JSON_OBJECT function does not support the syntax of JSONPath expressions.

Input parameters of JSON data types

Syntax

string get_json_object(json <json>, string <json_path>)

Description
Extracts a single string from a standard JSON string based on JSON PATH.
Parameters
- json: required. A JSON string from which you want to extract a single string.
- json_path: required. A JSONPath expression based on which you want to extract a single string.
Return value
A value of a STRING type is returned.
Examples
- Example 1: Extract the value that corresponds to the key a from a JSON string.
```
select get_json_object(json '{"a":1, "b":2}', '$.a');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| 1   |
+-----+
```
- Example 2: Extract the value that corresponds to the key c from a JSON string.
```
select get_json_object(json '{"a":1, "b":2}', '$.c');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| NULL |
+-----+
```
- Example 3: If an invalid JSON path is specified, the return value is NULL.
```
select get_json_object(json '{"a":1, "b":2}', '$invalid_json_path');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| NULL |
+-----+
```

Input parameter of the STRING data type

Syntax

string get_json_object(string <json>, string <path>)

Description
Extracts a single string from a standard JSON string based on path. The original data is read each time this function is called. Therefore, repeated calls may affect system performance and increase costs. To prevent repeated calls, you can use the GET_JSON_OBJECT function with UDTFs. For more information, see Convert JSON log data by using MaxCompute built-in functions and UDTFs.
Parameters
- json: required. A value of the STRING type. This parameter specifies a standard JSON object in the format of {Key:Value, Key:Value,...}. If the string contains a double quotation mark ("), use two backslashes (\\) to escape the double quotation mark (") before extraction. If the string contains a single quotation mark ('), use a single backslash (\) to escape the single quotation mark (') before extraction.
- path: required. A value of the STRING type. This parameter specifies the path in the value of the json parameter and starts with $. For more information about the path parameter, see LanguageManual UDF. For more information about best practices, see Migrate JSON-formatted data from OSS to MaxCompute. Meanings of different characters:
  - $: indicates the root node.
  - . or ['']: indicates a child node. MaxCompute parses JSON objects by using . or ['']. If a key in a JSON object contains a period (.), [''] can be used.
  - [] ([number]): indicates an array subscript, which starts from 0.
  - *: indicates the wildcard for []. If this character is used in the path parameter, an entire array is returned. An asterisk (*) cannot be escaped.
Limits
Only MaxCompute V2.0 allows you to extract data by using [''] in the path parameter. To use [''], you must add the set odps.sql.udf.getjsonobj.new=true; statement before the statement that you want to execute.
Return value
- If the json parameter is empty or invalid, null is returned.
- If the format of json is valid and path exists, the related string is returned.
- You can specify the flagodps.sql.udf.getjsonobj.new parameter for a session to determine how this function returns a value.
  - If you execute the set odps.sql.udf.getjsonobj.new=true; statement, this function retains the original strings when it returns a value.
    We recommend that you use this configuration because it results in more standard function return behavior. This facilitates data processing and improves data processing performance. If a job in which this function escapes JSON reserved characters exists in a MaxCompute project, we recommend that you retain the original escape operation to prevent errors caused by lack of verification. The function complies with the following rules when it returns a value:
    - In this configuration, the return value is still a JSON string, which can be parsed as JSON data, without the need to use the REPLACE or REGEXP_REPLACE function to replace backslashes (\).
    - Duplicate keys are allowed in a JSON object. If duplicate keys exist, the data can be parsed.
      -- The return value is 1. select get_json_object('{"a":"1","a":"2"}', '$.a');
    - The encoded strings that correspond to emojis are supported. However, DataWorks does not allow you to enter emojis. DataWorks allows you to enter only the encoded strings that correspond to emojis to MaxCompute by using a tool, such as Data Integration. DataWorks uses the GET_JSON_OBJECT function to process the data.
      -- The return value is an emoji. select get_json_object('{"a":"<Emoji>"}', '$.a');
    - The output results are displayed in alphabetical order.
      -- The return value is {"b":"1","a":"2"}. select get_json_object('{"b":"1","a":"2"}', '$');
  - If you execute the set odps.sql.udf.getjsonobj.new=false; statement, this function escapes JSON reserved characters when it returns a value. The function complies with the following rules when it returns a value:
    - JSON reserved characters such as line feeds (\n) and quotation marks (") are displayed as '\n' and '\"'.
    - Each key in a JSON object must be unique. If duplicate keys exist, the data may fail to be parsed. Sample statement:
```
-- The return value is null. 
select get_json_object('{"a":"1","a":"2"}', '$.a');
```
    - The encoded strings that correspond to emojis cannot be parsed. Sample statement:
```
-- The return value is null. 
select get_json_object('{"a":"<Emoji>"}', '$.a');
```
    - The output results are displayed in alphabetical order. Sample statement:
```
-- The return value is {"a":"2","b":"1"}. 
select get_json_object('{"b":"1","a":"2"}', '$');
```
  Note
  For MaxCompute projects that were created on or after January 21, 2021, the GET_JSON_OBJECT function retains the original strings when it returns a value. For MaxCompute projects that are created before January 21, 2021, the GET_JSON_OBJECT function escapes JSON reserved characters when it returns a value. The following example helps you determine how the GET_JSON_OBJECT function returns a value in a MaxCompute project.
```
select get_json_object('{"a":"[\\"1\\"]"}', '$.a');
-- Return JSON reserved characters by using escape characters.
[\"1\"]

-- Return the original strings.
["1"]
```
  You can submit an application or search for the DingTalk group ID 11782920 to join the MaxCompute developer community DingTalk group and request MaxCompute technical support engineers to configure the GET_JSON_OBJECT function to retain original strings. This way, you do not need to frequently specify set odps.sql.udf.getjsonobj.new=false; for a session.

Examples

Example 1: Extract information from the JSON object src_json.json. Sample statement:

-- The JSON string src_json.json contains the following content: 
+----+
json
+----+
{"store":
{"fruit":[{"weight":8,"type":"apple"},{"weight":9,"type":"pear"}],
"bicycle":{"price":19.95,"color":"red"}
},
"email":"amy@only_for_json_udf_test.net",
"owner":"amy"
}
-- Extract the information of the owner field and return amy. 
select get_json_object(src_json.json, '$.owner') from src_json;
-- Extract the information of the first array in the store.fruit field and return {"weight":8,"type":"apple"}. 
select get_json_object(src_json.json, '$.store.fruit[0]') from src_json;
-- Extract the information of the non-existent field and return null. 
select get_json_object(src_json.json, '$.non_exist_key') from src_json;

Example 2: Extract information from a JSON object of the ARRAY type. Sample statement:

-- The return value is 2222. 
select get_json_object('{"array":[["aaaa",1111],["bbbb",2222],["cccc",3333]]}','$.array[1][1]');
-- The return value is ["h0","h1","h2"]. 
set odps.sql.udf.getjsonobj.new=true;
select get_json_object('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}','$.ccc.hhh[*]');
-- The return value is ["h0","h1","h2"]. 
set odps.sql.udf.getjsonobj.new=false;
select get_json_object('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}','$.ccc.hhh');
-- The return value is h1. 
select get_json_object('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}','$.ccc.hhh[1]');

Example 3: Extract information from a JSON object that includes a period (.). Sample statement:

-- Create a table. 
create table mf_json (id string, json string);
-- Insert data into the table. The key in the data contains a period (.). 
insert into table mf_json (id, json) values ("1", "{
\"China.beijing\":{\"school\":{\"id\":0,\"book\":[{\"title\": \"A\",
\"price\": 8.95},{\"title\": \"B\",\"price\": 10.2}]}}}");
-- Insert data into the table. The key in the data does not contain a period (.). 
insert into table mf_json (id, json) values ("2", "{
\"China_beijing\":{\"school\":{\"id\":0,\"book\":[{\"title\": \"A\",
\"price\": 8.95},{\"title\": \"B\",\"price\": 10.2}]}}}");
-- Query the value of id in the JSON object whose key is China.beijing. 0 is returned. Only [''] can be used to specify the key because the key contains a period (.). This way, MaxCompute can parse the key. 
select get_json_object(json, "$['China.beijing'].school['id']") from mf_json where id =1;
-- Query the value of id in the JSON object whose key is China_beijing. 0 is returned. You can use one of the following statements: 
select get_json_object(json, "$['China_beijing'].school['id']") from mf_json where id =2;
select get_json_object(json, "$.China_beijing.school['id']") from mf_json where id =2;

Example 4: The json parameter is empty or invalid. Sample statement:

-- The return value is null. 
select get_json_object('','$.array[1][1]');
-- The return value is null. 
select get_json_object('"array":["aaaa",1111],"bbbb":["cccc",3333]','$.array[1][1]');

Example 5: Escape a JSON string. Sample statement:

set odps.sql.udf.getjsonobj.new=true;
-- The return value is "1". 
select get_json_object('{"a":"\\"1\\"","b":"2"}', '$.a'); 
-- The return value is '1'. 
select get_json_object('{"a":"\'1\'","b":"2"}', '$.a');

INDEX

Syntax
```
index(<var1>[<var2>])
```
Description
- If var1 is of the map<K, V> type, this function obtains the element that is at position var2 in var1. The position numbers of elements are counted from left to right and start from 0.
- If var1 is of the map<K, V> type, this function obtains the value whose key is var2 in var1.
Note
When you use this function, you must remove index and directly execute <var1>[<var2>]. Otherwise, an error is returned.
Parameters
- var1: required. The value must be of the array<T> or map<K, V> type. T in array<T> specifies the data type of the elements in an array. The elements can be of any data type. K and V in map<K, V> specify the keys and values of a map.
- var2: required.
  - If var1 is of the array<T> type, var2 must be of the BIGINT type and greater than or equal to 0.
  - If var1 is of the map<K, V> type, var2 must be of the same data type as K.
Return value
- If var1 is of the array<T> type, a value of the data type that is specified by T is returned. The return value varies based on the following rules:
  - If the number of elements in var1 is less than var2, null is returned.
  - If var1 is null, null is returned.
- If var1 is of the map<K, V> type, a value of the data type that is specified by V is returned. The return value varies based on the following rules:
  - If map<K, V> does not contain a key whose value is var2, null is returned.
  - If var1 is null, null is returned.

Examples

Example 1: var1 is of the array<T> type. Sample statement:

-- The return value is c. 
select array('a','b','c')[2];

Example 2: var1 is of the map<K, V> type. Sample statement:

-- The return value is 1. 
select str_to_map("test1=1,test2=2")["test1"];

INLINE

Syntax

inline(array<struct<f1:T1, f2:T2[, ...]>>)

Description
Expands a given struct array. Each array element corresponds to a row and each struct element corresponds to a column in each row.
Parameters
f1:T1 and f2:T2: required. Values of any data type. f1 and f2 specify member variables, T1 specifies the value of f1, and T2 specifies the value of f2.
Return value
The expanded data of the struct array is returned.

Examples

The t_table table contains the t_struct (STRUCT<user_id:BIGINT,user_name:STRING,married:STRING,weight:DOUBLE>) fields. Data in the table:

+----------+
| t_struct |
+----------+
| {user_id:10001, user_name:LiLei, married:N, weight:63.5} |
| {user_id:10002, user_name:HanMeiMei, married:Y, weight:43.5} |
+----------+

Sample statement:

-- Expand the t_struct column. 
select inline(array(t_struct)) from t_table;
-- The following result is returned: 
+------------+-----------+---------+------------+
| user_id    | user_name | married | weight     |
+------------+-----------+---------+------------+
| 10001      | LiLei     | N       | 63.5       |
| 10002      | HanMeiMei | Y       | 43.5       |
+------------+-----------+---------+------------+

JSON_TUPLE

Syntax

string json_tuple(string <json>, string <key1>, string <key2>,...)

Description
Extracts strings from a standard JSON string based on a set of input keys, such as (key1,key2,...).
Parameters
- json: required. A value of the STRING type. This parameter specifies a standard JSON string.
- key: required. A value of the STRING type. This parameter is used to describe the path of a JSON object in the JSON string. The value cannot start with a dollar sign ($). You can enter multiple keys at a time. MaxCompute parses JSON objects by using . or ['']. If a key in a JSON object includes a period (.), [''] can be used.
Return value
A value of the STRING type is returned.
Note
- If json is empty or invalid, null is returned.
- If key is empty, invalid, or does not exist in the JSON string, null is returned.
- If json is valid and key exists, the related string is returned.
- This function can parse JSON data that contains Chinese characters.
- This function can parse nested JSON data.
- This function can parse JSON data that contains nested arrays.
- The parsing action is equivalent to the execution of GET_JSON_OBJECT along with set odps.sql.udf.getjsonobj.new=true;. To obtain multiple objects from a JSON string, you must call the GET_JSON_OBJECT function multiple times. As a result, the JSON string is parsed multiple times. The JSON_TUPLE function allows you to enter multiple keys at a time and the JSON string is parsed only once. JSON_TUPLE is more efficient than GET_JSON_OBJECT.
- JSON_TUPLE is a user-defined table-valued function (UDTF). If you want to select some columns from a table, use JSON_TUPLE together with the LATERAL VIEW clause.

JSON_OBJECT

Syntax

json json_object(<key1>,<value1>[,<keyn>,<valuen>])

Description
Returns a JSON object that contains key-value pairs.
Parameters
- key: You must specify at least one key. The key must be of the STRING type.
- value: You must specify at least one value. The value must be of the STRING, BIGINT, INT, or BOOLEAN type.
Return value
The return value is of the JSON type.

Examples

Example 1: Generate a JSON object that contains only one key-value pair.
```
-- Generate a JSON object.
select json_object('a', 123);
```
The following result is returned:
```
+-----+
| _c0 |
+-----+
| {"a":123} |
+-----+
```

Example 2: Generate a JSON object that contains multiple key-value pairs.

-- Generate a JSON object.
select json_object('a', 123,'b','hello');

The following result is returned:

+-----+
| _c0 |
+-----+
| {"a":123,"b":"hello"} |
+-----+

JSON_ARRAY

Syntax
```
json json_array(<element>)
```
Description
Evaluates a possibly empty list of values and returns a JSON array that contains these values.
Parameters
element: required. The values in the list specified by this parameter must be of the STRING, BIGINT, BOOLEAN, or JSON type.
Return value
The return value is of the JSON type.

Examples

-- Generate a JSON array.
select json_array('a', 45, true, 13, json '{"a":456}');

The following result is returned:

+-----+
| _c0 |
+-----+
| ["a",45,true,13,{"a":456}] |
+-----+

JSON_EXPLODE

Syntax
```
JSON_EXPLODE(JSON <var>)
```
Description
Supports expanding each element in a JSON array or JSON object into multiple output rows.
Parameters
var: Required. Supports JSON_ARRAY or JSON_OBJECT types. Does not support JSON constants, JSON STRING, JSON NUMBER, JSON BOOLEAN, and NULL types.
Return value
The function returns transformed rows in the following format:
```
+-------+-------+
|  KEY  | VALUE |
|-------+-------|
```
- For a JSON ARRAY type var, the outermost of the JSON ARRAY is expanded into multiple rows of JSON data, with KEY as NULL and VALUE as the array element.
- For a JSON OBJECT type var, each KEY-VALUE pair of the outermost of the JSON OBJECT is expanded into rows with two columns: A STRING type KEY column is the KEY of the JSON OBJECT, and a JSON type VALUE column is the VALUE of the JSON OBJECT.

Examples

Create a table named table_json and insert data into it, where the first row is a JSON OBJECT type and the second row is a JSON ARRAY type.

-- Create table
CREATE TABLE table_json(c1 json);
-- Insert data
INSERT INTO table_json(c1) SELECT JSON_OBJECT('a', 123,'b','hello');
INSERT INTO table_json(c1) SELECT JSON_ARRAY(1, true, 2, json'{"a":456}');
-- View table data
SELECT * FROM table_json;

The query returns the following result:

+-----------------------------+
| c1                            |
+-------------------------------+
| {"a":123,"b":"hello"}         |
| [1,true,2,{"a":456}]          |
+-------------------------------+

The JSON_EXPLODE function expands each element in a JSON array or JSON object into multiple output rows. Below is a sample code:

SELECT JSON_EXPLODE(table_json.c1) FROM table_json;

The query returns the following result:

+-----+------------+
| key | value      |
+-----+------------+
| \N  | 1          |
| \N  | true       |
| \N  | 2          |
| \N  | {"a":456}  |
| a   | 123        |
| b   | hello      |
+-----+------------+

Note

When transforming JSON data, elements within the same JSON data maintain their original sequence. However, the order of multiple JSON data entries may vary.

JSON_EXTRACT

Syntax
```
json json_extract(<json>, <json_path>)
```
Description
Parses the value of json_path in a JSON expression. Note that an error is returned if the value of json_path is invalid.
Parameters
- json: required. This parameter specifies the JSON expression that you want to process.
- json_path: required. This parameter specifies the JSON path of the value that you want to be returned.
Return value
The return value is of the JSON type.
Examples
- Example 1: Obtain the value of the key a from a JSON object.
```
select json_extract(json '{"a":1, "b":2}', '$.a');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| 1   |
+-----+
```
- Example 2: NULL is returned if the specified key does not exist in the JSON object.
```
select json_extract(json '{"a":1, "b":2}', 'strict $.c');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| NULL |
+-----+
```
- Example 3: The error message Invalid argument - Param json path $invalid_json_path is invalid is returned if the JSON path that you specified is invalid.
```
select json_extract(json '{"a":1, "b":2}', '$a');
```
  The following result is returned:
```
-- An error message is returned.
Invalid argument - Param json path $invalid_json_path is invalid
```

JSON_EXISTS

Syntax

boolean json_exists(<json>, <json_path>)

Description
Determines whether the JSON value of json_path exists.
Parameters
- json: required. This parameter specifies the JSON expression that you want to process.
- json_path: required. This parameter specifies the JSON path in which a specific JSON value is checked.
Return value
The return value is true or false, which is of the BOOLEAN type.
Examples
- Example 1: Check whether the value of the key a exists in a JSON object.
```
select json_exists(json '{"a":1, "b":2}', '$.a');
```
  The following result is returned:
```
+------+
| _c0  |
+------+
| true |
+------+
```
- Example 2: Check whether the value of the key c exists in a JSON object.
```
select json_exists(json '[1,2, {"a":34}]', '$[2].a');
```
  The following result is returned:
```
+------+
| _c0  |
+------+
| true |
+------+
```
- Example 3: Check whether values that are obtained based on a specific subscript exist.
```
select json_exists(json '{"a":1, "b":2}', 'strict $.c');
```
  The following result is returned:
```
+------+
| _c0  |
+------+
| false |
+------+
```

JSON_PRETTY

Syntax
```
string json_pretty(<json>)
```
Description
Returns a JSON value in a format that is easy to read by adding line breaks and spaces.
Parameters
json: required. This parameter specifies the JSON expression that you want to process.
Return value
The return value is of the STRING type.

Examples

-- Beautify a JSON object.
select json_pretty(json '{"a":1, "b":2}');

The following result is returned:

+-----+
| _c0 |
+-----+
| {
    "a":1,
    "b":2
} |
+-----+

JSON_TYPE

Syntax

string|number|boolean|null|object|array json_type(<json>)

Description
Returns the data type of a JSON value.
Parameters
json: required. The JSON expression that you want to process.
Return value
A value of the STRING type is returned.
Examples
- Example 1: Return the ARRAY type for a JSON value.
```
select json_type(json '[{"a":1}, 23]');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| array |
+-----+
```
- Example 2: Return a numeric data type for a JSON value.
```
select json_type(json '123');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| number |
+-----+
```
- Example 3: Return the STRING type for a JSON value.
```
select json_type(json '"123"');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| string |
+-----+
```

JSON_FORMAT

Syntax
```
string json_format(<json>)
```
Description
Converts a value of the JSON data type into a value of the STRING data type. By default, JSON data is not automatically prettified.
Parameters
json: required. This parameter specifies the JSON data that you want to convert.
Return value
The return value is of the STRING type.
Examples
- Example 1: Convert JSON NUMBER data into strings.
```
select json_format(json '123');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| 123 |
+-----+
```
- Example 2: Convert JSON STRING data into strings.
```
select json_format(json '"123"');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| "123" |
+-----+
```

JSON_PARSE

Syntax
```
json json_parse(<string>)
```
Description
Converts a value of the STRING data type into a value of the JSON data type. If a non-JSON-formatted value is converted into a value of the STRING data type, an error is returned.
Parameters
string: required. This parameter specifies the string that you want to process.
Return value
The return value is of the JSON type.
Examples
- Example 1: Convert a string into data of the JSON type.
```
select json_parse('{"a":1, "b":2}');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| {"a":1,"b":2} |
+-----+
```
- Example 2: Convert a string into data of the JSON type.
```
select json_parse('"abc"');
```
  The following result is returned:
```
+-----+
| _c0 |
+-----+
| "abc" |
+-----+
```
- Example 3: Convert an invalid string into data of the JSON type. An error is reported.
```
select json_parse('abc');
```
  The following result is returned:
```
Invalid input syntax for type json, detail:Token "abc" is invalid.
```

JSON_VALID

Syntax
```
boolean json_valid(<string>)
```
Description
Determines whether a string is in a valid JSON format.
Parameters
string: required. This parameter specifies the JSON string that you want to process.
Return value
The return value is true or false, which is of the BOOLEAN type.
Examples
- Example 1: Check whether "abc" is a valid JSON string.
```
select json_valid('"abc"');
```
  The following result is returned:
```
+------+
| _c0  |
+------+
| true |
+------+
```
- Example 2: Check whether abc is a valid JSON string.
```
select json_valid('abc');
```
  The following result is returned:
```
+------+
| _c0  |
+------+
| false |
+------+
```

CAST

Syntax

json/string/bigint/int/tinyint/smallint/double/float/boolean/sql-type 
cast(json as string/
     string as json/
     json as bigint/
     bigint as json/
     json as int/
     int as json/
     json as tinyint/
     tinyint as json/
     json as smallint/
     smallint as json/
     json as double/
     double as json/
     json as float/
     float as json/
     boolean as json/
     json as boolean/
     null as json/
     json 'null' as ...
     )

Description
Supports conversion between basic data types and JSON data types.
Parameters
- The following data types are supported: JSON, STRING, BIGINT, INT, TINYINT, SMALLINT, DOUBLE, FLOAT, BOOLEAN, and SQL-TYPE.
- To convert JSON data into data of the STRING data type, you must make sure the JSON data does not contain values of the ARRAY or OBJECT data type.
- After you convert data of the STRING data type into JSON data, the output JSON data contains only values of the STRING data type. Take note of the following differences between the CAST function and the JSON_PARSE and JSON_FORMAT functions:
  - You can use the JSON_PARSE function to convert only valid JSON strings into JSON data, which can be a JSON object.
  - You can use the CAST function to convert any string into a JSON string that contains values of the STRING data type.
- JSON 'null' values and common null values are converted into SQL null values.
Return value
JSON data or data of a specific basic data type is returned.

Examples

Example 1: Perform conversion between the STRING data type and the JSON data type.

-- Convert a JSON number into a value of the STRING type.
select cast(json '123' as string);
-- The following result is returned:
+-----+
| _c0 |
+-----+
| 123 |
+-----+

-- Convert a JSON string into a value of the STRING type.
select cast(json '"abc"' as string);
-- The following result is returned:
+-----+
| _c0 |
+-----+
| abc |
+-----+

-- Convert a JSON boolean into a value of the STRING type.
select cast(json 'true' as string); 
-- The following result is returned:
+-----+
| _c0 |
+-----+
| TRUE |
+-----+

-- Convert JSON 'null' into a null value of the STRING type.
select cast(json 'null' as string);
-- The following result is returned:
+-----+
| _c0 |
+-----+
| NULL |
+-----+

-- Convert a string into a value of a JSON type.
select cast('{"a":2}' as json); 
-- The following result is returned:
+-----+
| _c0 |
+-----+
| "{\"a\":2}" |
+-----+


-- Incorrect sample statement. Convert JSON expressions of the ARRAY or OBJECT type into values of the STRING type. 
select cast(json '{"a":2}' as string);
-- An error is returned.
FAILED: ODPS-0123091:Illegal type cast - Unsupported cast from json array/object to string

Example 2: Perform conversion between the NUMBER data type and the JSON data type.

-- Convert a JSON number into a value of the BIGINT type.
select cast(json '123' as bigint);
-- The following result is returned:
+------------+
| _c0        |
+------------+
| 123        |
+------------+

-- Convert a JSON number into a value of the FLOAT type.
select cast(json '"1.23"' as float);
-- The following result is returned:
+------+
| _c0  |
+------+
| 1.23 |
+------+

-- Convert a JSON number into a value of the DOUBLE type.
select cast(json '1.23' as double);
-- The following result is returned:
+------------+
| _c0        |
+------------+
| 1.23       |
+------------+

-- Convert a value of the INT type into a JSON number.
select cast(123 as json);
-- The following result is returned:
+-----+
| _c0 |
+-----+
| 123 |
+-----+

-- Convert a value of the FLOAT type into a JSON number.
select cast(1.23 as json);
-- The following result is returned:
+-----+
| _c0 |
+-----+
| 1.23 |
+-----+

Example 3: Perform conversion between the BOOLEAN data type and the JSON data type.

-- Convert a value of the BOOLEAN type into the BIGINT type.
select cast(true as json);
-- The following result is returned:
+-----+
| _c0 |
+-----+
| true |
+-----+

-- Convert a JSON boolean into a value of the BOOLEAN type.
select cast(json 'false' as boolean);
-- The following result is returned:
+------+
| _c0  |
+------+
| false |
+------+

-- Convert a JSON string into a value of the BOOLEAN type.
select cast(json '"abc"' as boolean);
-- The following result is returned:
+------+
| _c0  |
+------+
| true |
+------+

-- Convert a JSON array or JSON object into the BOOLEAN type.
select cast(json '[1,2]' as boolean);
-- An error is returned.
Unsupported cast from json array/object to boolean

Example 4: Perform conversion between a null value and a value of the JSON data type.

-- Convert a null value into a value of the STRING type.
select json_type(cast(null as json));
-- The following result is returned:
+-----+
| _c0 |
+-----+
| NULL |
+-----+

MAP

Syntax

map(K, V) map(K <key1>, V <value1>, K <key2>, V <value2>[, ...])

Description
Creates a map based on given key-value pairs.
Parameters
- key: required. All keys must be of the same data type after implicit conversions. Only basic data types are supported.
- value: required. All values must be of the same data type or must be of the same data type after implicit conversions. Data types other than DECIMAL are supported.
Return value
A value of the MAP type is returned.
Note
You can configure the odps.sql.map.key.dedup.policy parameter at the session level to specify the method that is used to process duplicate keys. Valid values:
- exception: An error is returned.
- last_win: The latter key overwrites the former key.
If you do not specify the parameter, the default value last_win is used.

Examples

Example 1: No duplicate keys exist. The t_table table contains the c1 (BIGINT), c2 (STRING), c3 (STRING), c4 (BIGINT), and c5 (BIGINT) columns. Data in the table:

+------------+----+----+------------+------------+
| c1         | c2 | c3 | c4         | c5         |
+------------+----+----+------------+------------+
| 1000       | k11 | k21 | 86         | 15         |
| 1001       | k12 | k22 | 97         | 2          |
| 1002       | k13 | k23 | 99         | 1          |
+------------+----+----+------------+------------+

Sample statement:

-- Define a map based on the key-value pairs between the c2 and c4 columns, and between the c3 and c5 columns. 
select map(c2,c4,c3,c5) from t_table;
-- The following result is returned: 
+------+
| _c0  |
+------+
| {k11:86, k21:15} |
| {k12:97, k22:2} |
| {k13:99, k23:1} |
+------+

Example 2: Duplicate keys exist. The t_table table contains the c1 (BIGINT), c2 (STRING), c3 (STRING), c4 (BIGINT), and c5 (BIGINT) columns. Data in the table:

1000,'k11','k11',86,15
1001,'k12','k22',97,2
1002,'k13','k23',99,1
1003,'k13','k24',100,1
1004,'k12','k25',95,1

Sample statement:

-- Define a map based on the key-value pairs between the c2 and c4 columns, and between the c3 and c5 columns. 
select map(c2,c4,c3,c5) from t_table;
-- The following result is returned: 
+------+
| _c0  |
+------+
| {'k11':15} |
| {'k12':97, 'k22':2} |
| {'k13':99, 'k23':1} |
| {'k13':100, 'k24':1} |
| {'k12':95, 'k25':1} |
+------+

MAP_CONCAT

Syntax

map<K, V> map_concat([string <mapDupKeyPolicy>,] map<K, V> <a>, map<K, V> <b>[,...])

Description
Calculates the union of multiple maps.
Parameters
- mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:
  - exception: An error is returned.
  - last_win: The latter key overwrites the former key.
  You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.
  Note
  The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.
- a and b: required. These parameters specify maps. The keys of the maps must be of the same data type, and the values of the maps must be of the same data type. K and V in map<K, V> specify the keys and values of a map.
Return value
A value of the MAP type is returned. The return value varies based on the following rules:
- A map is null or the key of a map is null.
- The data types of multiple maps are different.

Examples

-- The return value is {1:a, 2:b, 3:c}. 
select map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
-- The return value is {1:a, 2:d, 3:c}. 
select map_concat('last_win', map(1, 'a', 2, 'b'), map(3, 'c'), map(2, 'd'));

MAP_ENTRIES

Syntax

array<struct<K, V>> map_entries(map<K, V> <a>):

Description
Converts key-value pairs in Map a into a struct array.
Parameters
a: required. This parameter specifies a map. K and V in map<K, V> specify the keys and values of a map.
Return value
A struct array is returned. If the input is null, null is returned.

Examples

-- The return value is [{key:1, value:a}, {key:2, value:b}]. 
select map_entries(map(1,  'a',  2,  'b'));

MAP_FILTER

Syntax

map<K, V> map_filter(map<K, V> <input>, function <K, V, boolean> <predicate>)

Description
Filters the elements in Map input and retains only the elements that meet the predicate condition.
Parameters
- input: required. A value of the MAP type. K and V in map<K, V> specify the keys and values of a map.
- predicate: required. This parameter specifies the built-in function, user-defined function, or expression that is used to filter the elements in the map. The predicate condition consists of two input parameters that correspond to the keys and values in input. The output result is of the BOOLEAN type.
Return value
A value of the MAP type is returned.

Examples

-- The return value is {-30:100, 20:50}. 
select map_filter(map(10, -20, 20, 50, -30, 100, 21, null), (k, v) -> (k+v) > 10);

MAP_FROM_ARRAYS

Syntax

map<K, V> map_from_arrays([string <mapDupKeyPolicy>,] array<K> <a>, array<V> <b>))

Description
Creates a map based on Array a and Array b.
Parameters
- mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:
  - exception: An error is returned.
  - last_win: The latter key overwrites the former key.
  You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.
  Note
  The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.
- a: required. This parameter specifies an array. This parameter corresponds to the key in the generated map. K in array<K> specifies the data type of the elements in the array. The elements can be of any data type.
- b: required. This parameter specifies an array. This parameter corresponds to the value in the generated map. V in array<V> specifies the data type of the elements in the array. The elements can be of any data type.
Return value
A value of the MAP type is returned. The return value varies based on the following rules:
- If Array a or Array b is null, null is returned.
- If Array a contains a null element or the two arrays are of different lengths, an error is returned.

Examples

-- The return value is {1:2, 3:4}. 
select map_from_arrays(array(1.0, 3.0), array('2', '4'));
-- The return value is {1:2, 3:6}. 
select map_from_arrays('last_win', array(1.0, 3.0, 3), array('2', '4', '6'));

MAP_FROM_ENTRIES

Syntax

map<K, V> map_from_entries([string <mapDupKeyPolicy>,] array <struct<K, V> , struct<K, V>[,...]>)

Description
Creates a map based on given struct arrays.
Parameters
- mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:
  - exception: An error is returned.
  - last_win: The latter key overwrites the former key.
  You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.
  Note
  The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.
- Input parameters of this function are of the STRUCT type. K corresponds to the keys in the generated map. V corresponds to the values in the generated map. K and V in struct<K, V> specify the keys and values of a struct array.
Return value
A value of the MAP type is returned. The return value varies based on the following rules:
- If a struct array is null, null is returned.
- If the number of fields in a struct array is not 2 or the key of a struct array is null, an error is returned.

Examples

-- The return value is {1:a, 2:b}. 
select map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
-- The return value is {1:a, 2:c}. 
select map_from_entries(array(struct(1, 'a'), struct(2, 'b'), struct(2, 'c')));

MAP_KEYS

Syntax
```
array<K> map_keys(map<K, V> <a>)
```
Description
Returns all keys in Map a as an array.
Parameters
a: required. This parameter specifies a map. K and V in map<K, V> specify the keys and values of a map.
Return value
A value of the ARRAY type is returned. If the input map is null, null is returned.

Examples

The t_table_map table contains the c1 (BIGINT) and t_map (MAP<STRING,BIGINT>) columns. Data in the table:

+------------+-------+
| c1         | t_map |
+------------+-------+
| 1000       | {k11:86, k21:15} |
| 1001       | {k12:97, k22:2} |
| 1002       | {k13:99, k23:1} |
+------------+-------+

Sample statement:

-- Return keys in the t_map column as an array. 
select c1, map_keys(t_map) from t_table_map;
-- The following result is returned: 
+------------+------+
| c1         | _c1  |
+------------+------+
| 1000       | [k11, k21] |
| 1001       | [k12, k22] |
| 1002       | [k13, k23] |
+------------+------+

MAP_VALUES

Syntax
```
array<V> map_values(map<K, V> <a>)
```
Description
Returns all values in Map a as an array.
Parameters
a: required. This parameter specifies a map. K and V in map<K, V> specify the keys and values of a map.
Return value
A value of the ARRAY type is returned. If the input map is null, null is returned.

Examples

The t_table_map table contains the c1 (BIGINT) and t_map (MAP<STRING,BIGINT>) columns. Data in the table:

+------------+-------+
| c1         | t_map |
+------------+-------+
| 1000       | {k11:86, k21:15} |
| 1001       | {k12:97, k22:2} |
| 1002       | {k13:99, k23:1} |
+------------+-------+

Sample statement:

-- Return keys in the t_map column as an array. 
select c1,map_values(t_map) from t_table_map;
-- The following result is returned: 
+------------+------+
| c1         | _c1  |
+------------+------+
| 1000       | [86, 15] |
| 1001       | [97, 2] |
| 1002       | [99, 1] |
+------------+------+

MAP_ZIP_WITH

Syntax

<K, V1, V2, V3> map<K, V3> map_zip_with(map<K, V1> <input1>, map<K, V2> <input2>, function<K, V1, V2, V3> <func>)

Description
Merges Map input1 and Map input2 into a new map. The keys of the new map are the union of the keys of the two input maps. The value of each key of the new map is calculated by using func.
Parameters
- input1 and input2: required. This parameter specifies a map. K and V in map<K, V> specify the keys and values of a map.
- func: required. func consists of three input parameters, which correspond to a key, the value that corresponds to the key in input1, and the value that corresponds to the key in input2. If the key does not exist in input1 or input2, null is used to replace the value that corresponds to the key in func.
Return value
Data of the data type defined by func is returned.

Examples

-- The return value is {1:[1, 1, 4], 2:[2, 2, 5], 3:[3, null, null], 4:[4, null, 7]}. 
select map_zip_with(map(1, 1, 2, 2, 3, null), map(1, 4, 2, 5, 4, 7), (k, v1, v2) -> array(k, v1, v2));

MULTIMAP_FROM_ENTRIES

Syntax

	multimap_from_entries(array<struct<K, V>>)

Description
Returns a map that consists of the keys in a struct array and an array containing all values.
Parameters
array<struct<K, V>>: a struct array that consists of key-value pairs.
Return value
A map that consists of keys in a struct array and an array containing all values is returned. The map is in the map<K, array<V>> format.
- Each key in the returned map can be associated with multiple values, and the associated values are in the same array.
- If the array is null, null is returned.
- If the number of fields in the struct array is not 2 or the keys contain null values, an error is returned.

Examples

SELECT multimap_from_entries(array(struct(1, 'a'), 
                                     struct(2, 'b'), 
                                     struct(1, 'c')));

The following result is returned:

{1 : ['a', 'c'],  2: ['b']}

NAMED_STRUCT

Syntax

struct named_struct(string <name1>, T1 <value1>, string <name2>, T2 <value2>[, ...])

Description
Creates a struct based on given name-value pairs.
Parameters
- value: required. A value of any data type.
- name: required. The column name of the STRING type. This parameter is a constant.
Return value
A value of the STRUCT type is returned. Columns are sequentially named as name1, name2, ....

Examples

-- The return value is {user_id:10001, user_name:LiLei, married:F, weight:63.5}. 
select named_struct('user_id',10001,'user_name','LiLei','married','F','weight',63.50);

NGRAMS

Syntax
```
ngrams(array(T), n)
```
Description
Returns an n-gram array of the elements of the specified array.
Parameters
- array: an input array.
- n: the number of elements in each subset.
Return value
Returns an n-gram array of the elements of the specified array.
If the n<=0 condition is met, an error is returned.

Examples

SELECT ngrams(array('foo', 'bar', 'baz', 'foo'), 2);

The following result is returned:

[['foo', 'bar'], ['bar', 'baz'], ['baz', 'foo']]

SELECT ngrams(array('foo', 'bar', 'baz', 'foo'), 3);

The following result is returned:

[['foo', 'bar', 'baz'], ['bar', 'baz', 'foo']]

SELECT ngrams(array('foo', 'bar', 'baz', 'foo'), 4);

The following result is returned:

[['foo', 'bar', 'baz', 'foo']]

SELECT ngrams(array('foo', 'bar', 'baz', 'foo'), 5);

The following result is returned:

[['foo', 'bar', 'baz', 'foo']]

SELECT ngrams(array(1, 2, 3, 4), 2);

The following result is returned:

[[1, 2], [2, 3], [3, 4]]

POSEXPLODE

Syntax
```
posexplode(array<T> <a>)
```
Description
Converts Array a into a table that has two columns. The first column lists the position of each element in the array, starting from 0. The second column lists the elements.
Parameters
a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
Return value
A table is returned.

Examples

select posexplode(array('a','c','f','b'));
-- The following result is returned: 
+------------+------------+
| pos        | val        |
+------------+------------+
| 0          | a          |
| 1          | c          |
| 2          | f          |
| 3          | b          |
+------------+------------+

REVERSE

Syntax
```
array reverse(array <value>)
```
Description
Returns an array in reverse order of the elements of a specific array.
Parameters
value: an input array.
Return value
An array in reverse order of the elements of an input array is returned. If the input value is null, null is returned.

Examples

-- The return value is [3, 4, 1, 2].
SELECT reverse(array(2, 1, 4, 3));

SEQUENCE

Syntax
```
sequence(start, stop, [step]) -> array
```
Description
Returns an array that contains the specified elements based on expressions.
Parameters
- start: an expression that represents the beginning of a sequence of elements. The sequence contains start.
  - start and stop support the following integer types: TINYINT, SMALLINT, INT, and BIGINT. The preceding integer types correspond to the following types supported by step: TINYINT, SMALLINT, INT, and BIGINT.
  - start and stop support the following date and time types: DATE, DATETIME, and TIMESTAMP. The preceding date and time types correspond to the IntervalDayTime or IntervalYearMonth type supported by step.
- stop: an expression that represents the end of a sequence of elements. The sequence contains stop.
- step: optional. The step size of a sequence of elements.
  By default, if the value of start is less than or equal to the value of stop, the value of step is 1. If the value of start is greater than the value of stop, the value of step is -1.
  If the sequence of elements is of the time type, the default value of step is 1 or -1 in days. When you configure step, take note of the following points: If the value of start is greater than the value of stop, set step to a negative value. If the value of start is less than or equal to the value of stop, set step to a positive value. If the requirements are not met, an error is returned.
Return value
An array that contains the specified elements based on expressions is returned.
- If the value of start is greater than the value of stop and the value of step is positive, an error is returned. If the value of start is less than or equal to the value of stop and the value of step is negative, an error is returned.
- By default, the SEQUENCE function can generate a maximum of 10,000 elements. You can configure the odps.sql.max.sequence.length parameter to change the maximum number of elements that can be generated.

Examples

```
SELECT sequence(1, 5);
```
The following result is returned:
```
[1, 2, 3, 4, 5]
```
```
SELECT sequence(5, 1);
```
The following result is returned:
```
[5, 4, 3, 2, 1] 
```

SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);

The following result is returned:

[2018-01-01,  2018-02-01,  2018-03-01]

SHUFFLE

Syntax
```
shuffle(array)
```
Description
Parameters
array: an input array.
Return value
Returns elements of an array in random order.
- If the input value is null, null is returned.
- The returned result of this function is undefined.

Examples

SELECT shuffle(array(1, 20, 3, 5));

The following result is returned:

[3,1,5,20]

SELECT shuffle(array(1, 20, null, 3));

The following result is returned:

[20,null,3,1]

SIZE

Syntax

int size(array<T> <a>)
int size(map<K, V> <b> )

Description
- An array as the input: Calculates the number of elements in Array a.
- A map as the input: Calculates the number of key-value pairs in Map b.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- b: required. This parameter specifies a map. K and V in map<K, V> specify the keys and values of a map.
Return value
A value of the INT type is returned.
Examples
- Example 1: Calculate the number of elements in array('a','b'). Sample statement:
```
-- The return value is 2. 
select size(array('a','b'));
```
- Example 2: Calculate the number of key-value pairs in map('a',123,'b',456).
```
-- The return value is 2. 
select size(map('a',123,'b',456)); 
```

SLICE

Syntax

array<T> slice(array<T> <a>, <start>, <length>)

Description
Copies the elements in an array from the start position based on a specified length and returns the elements as a new array.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- start: required. This parameter specifies the position at which the function starts to copy elements from left to right. The minimum positive value of this parameter is 1. You can also set the start parameter to a negative value. In this case, the start position is counted from the end of the array, but the elements are still copied from left to right.
- length: required. The number of elements in the returned array. The value must be greater than or equal to 0. If the value is greater than the length of the input array, this function returns a new array that consists of the elements from the start position to the end of the input array.
Return value
A value of the ARRAY type is returned.
Examples
- Example 1: Copy the elements in array(10, 20, 20, null, null, 30) from position 1 based on the length of 3. Sample statement:
```
-- The return value is [10, 20, 20]. 
select slice(array(10, 20, 20, null, null, 30), 1, 3);
```
- Example 2: Copy the elements in array(10, 20, 20, null, null, 30) from position -2 based on the length of 2. Sample statement:
```
-- The return value is [null, 30]. 
select slice(array(10, 20, 20, null, null, 30), -2, 2);  
```
- Example 3: Copy the elements in array(10, 20, 20, null, null, 30) from position 3 based on the length of 10. Sample statement:
```
-- The return value is [20, null, null, 30]. 
select slice(array(10, 20, 20, null, null, 30), 3, 10); 
```
- Example 4: Copy the elements in array(10, 20, 20, null, null, 30) from position 3 based on the length of 0. Sample statement:
```
-- The return value is []. 
select slice(array(10, 20, 20, null, null, 30), 3, 0);
```

SORT_ARRAY

Syntax

array<T> sort_array(array<T> <a>[, <isasc>])

Description
Sorts the elements in an array.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- isasc: optional. This parameter specifies the order in which elements in the array are sorted. Valid values: true and false. The value true indicates the ascending order and false indicates the descending order. If you do not specify this parameter, the elements are sorted in ascending order.
Return value
A value of the ARRAY type is returned. The value null is interpreted as the minimum value.

Examples

Example 1: Sort data in the t_array table that contains the c1 ARRAY<STRING>, c2 ARRAY<INT>, and c3 ARRAY<STRING> columns. Data in the table:

+------------+---------+--------------+
| c1         | c2      | c3           |
+------------+---------+--------------+
| [a, c, f, b]  | [4, 5, 7, 2, 5, 8]  |  [You, Me, Him] |
+------------+---------+--------------+

Sort data in each column of the table. Sample statement:

-- The return value is [a, b, c, f] [2, 4, 5, 5, 7, 8] [Him, Me, You]. 
select sort_array(c1),sort_array(c2),sort_array(c3) from t_array;

Example 2: Sort the elements in array(10, 20, 40, 30, 30, null, 50) in descending order. Sample statement:

-- The return value is [50, 40, 30, 30, 20, 10, null]. 
select sort_array(array(10, 20, 40, 30, 30, null, 50), false);

SPLIT

Syntax
```
split(<str>, <pat>)
```
Description
Returns an array after str is split with pat.
Parameters
- str: required. A value of the STRING type. This parameter specifies the string that you want to split.
- pat: required. A delimiter of the STRING type. Regular expressions are supported. For more information about regular expressions, see Regular expressions.
Return value
An array is returned. The elements in the array are of the STRING type.

Examples

-- The return value is [a, b, c]. 
select split("a, b, c", ",");

STRUCT

Syntax
```
struct struct(<value1>,<value2>[, ...])
```
Description
Creates a struct based on a given value list.
Parameters
value: required. A value of any data type.
Return value
A value of the STRUCT type is returned. Columns are sequentially named as col1, col2, ....

Examples

-- The return value is {col1:a, col2:123, col3:true, col4:56.9}. 
select struct('a',123,'true',56.90);

TO_JSON

Syntax
```
string to_json(<expr>)
```
Description
Converts an expression specified by expr of a given complex data type into a JSON string.
Parameters
expr: required. An expression of the ARRAY, MAP, or STRUCT type.
Note
If the input expression is of the STRUCT type (struct<key1:value1, key2:value2>), take note of the following points:
- All keys are converted into lowercase letters when you convert the expression into a JSON string.
- If a value is null, the key-value pair to which the value belongs is not included in the JSON string that is returned. For example, if value2 is null, key2:value2 is not included in the JSON string that is returned.
Return value
A JSON string is returned.

Examples

Example 1: Convert an expression of a given complex data type into a JSON string. Sample statement:

-- The return value is {"a":1,"b":2}. 
select to_json(named_struct('a', 1, 'b', 2));
-- The return value is {"time":"26/08/2015"}. 
select to_json(named_struct('time', "26/08/2015"));
-- The return value is [{"a":1,"b":2}]. 
select to_json(array(named_struct('a', 1, 'b', 2)));
-- The return value is {"a":{"b":1}}. 
select to_json(map('a', named_struct('b', 1)));
-- The return value is {"a":1}. 
select to_json(map('a', 1));
-- The return value is [{"a":1}]. 
select to_json(array((map('a', 1))));

Example 2: The input expression is of the STRUCT type. Sample statement:

-- The return value is {"a":"B"}. If the expression of the STRUCT type is converted into a JSON string, all keys are converted into lowercase letters. 
select to_json(named_struct("A", "B"));
-- The return value is {"k2":"v2"}. The key-value pair to which null belongs is not included in the JSON string that is returned. 
select to_json(named_struct("k1", cast(null as string), "k2", "v2"));

TRANSFORM

Syntax

array<R> transform(array<T> <a>, function<T, R> <func>)

Description
Transforms the elements in Array a by using func and returns a new array.
Parameters
- a: required. This parameter specifies an array. T in array<T> specifies the data type of the elements in the array. The elements can be of any data type.
- func: required. This parameter specifies the built-in function, user-defined function, or expression that is used to transform the elements in Array a. The value must be of the same data type as the elements in Array a. R specifies the data type of the output results.
Return value
A value of the ARRAY type is returned.

Examples

-- The return value is [2, 3, 4]. 
select transform(array(1, 2, 3), x -> x + 1);

TRANSFORM_KEYS

Syntax

map<K2, V> transform_keys([string <mapDupKeyPolicy>,] map<K1, V> <input>, function<K1, V, K2> <func>)

Description
Transforms the keys in Map input by using func. The values in the map are not changed.
Parameters
- mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:
  - exception: An error is returned.
  - last_win: The latter key overwrites the former key.
  You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.
  Note
  The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.
- input: required. This parameter specifies a map. K1 and V in map<K1, V> specify the keys and values of a map.
- func: required. This parameter specifies the built-in function, user-defined function, or expression that is used to transform the keys. The function or expression consists of two input parameters that correspond to the keys and values in input. K2 specifies the data type of keys in the returned map.
Return value
A value of the MAP type is returned. If one of the new keys is null, an error is returned.

Examples

-- The return value is {-10:-20, 70:50, 71:101}. 
select transform_keys(map(10, -20, 20, 50, -30, 101), (k, v) -> k + v);
-- No error is returned. The returned result depends on the order of the elements in the input map. 
select transform_keys("last_win", map(10, -20, 20, 50, -30, 100), (k, v) -> k + v);
-- An error is returned because duplicate keys exist. 
select transform_keys("exception", map(10, -20, 20, 50, -30, 100), (k, v) -> k + v);

TRANSFORM_VALUES

Syntax

map<K, V2> transform_values(map<K, V1> <input>, function<K, V1, V2> <func>)

Description
Transforms the values in Map input by using func. The keys in the map are not changed.
Parameters
- input: required. This parameter specifies a map. K and V1 in map<K, V1> specify the keys and values of the map.
- func: required. This parameter specifies the built-in function, user-defined function, or expression that is used to transform the keys. The function or expression consists of two input parameters that correspond to the keys and values in input. V2 specifies the data type of values in the returned map.
Return value
A value of the MAP type is returned.

Examples

-- The return value is {-30:71, 10:-10, 20:NULL}. 
select transform_values(map(10, -20, 20, null, -30, 101), (k, v) -> k + v);

ZIP_WITH

Syntax

array<R> zip_with(array<T> <a>, array<S> <b>, function<T, S, R> <combiner>)

Description
Merges Array a and Array b at the element level based on element positions and a combiner and returns a new array.
Parameters
- a and b: required. This parameter specifies an array. T in array<T> and S in array<S> specify the data types of the elements in the arrays. The elements can be of any data type.
- combiner: required. This parameter specifies the built-in function, user-defined function, or expression that is used to merge Array a and Array b at the element level. The combiner consists of two input parameters. One input parameter must be of the same data type as the elements in Array a. The other input parameter must be of the same data type as the elements in Array b.
Return value
A value of the ARRAY type is returned. The return value varies based on the following rules:
- The elements in the returned array are at the same positions as the elements in Array a and Array b.
- If Array a and Array b have different lengths, the null elements are added to the shorter array before the arrays are merged.

Examples

-- The return value is [2, 4, 6, null]. 
select zip_with(array(1,2,3), array(1,2,3,4), (x,y) -> x + y);