ODPS SQL - Transpose Column to Row or Row to Column

This article describes how to use TRANS_ARRAY and LATERAL VIEW EXPLODE functions to transpose columns to rows in MaxCompute.

By Gaoxun

Scenarios

Suppose we need to convert the self_code_list column in Table A into an a_tag_list. Each self_code maps one-to-one to an a_tag, and this mapping is stored in Table B. A mapping like this is usually resolved with a JOIN (if you know a better method, feel free to leave a comment).

Figure 1 Table A

Figure 2 Table B

The current data structure is not easy to work with. What if we convert it to the one shown in Figure 3?

Figure 3

With the structure in Figure 3, we can join directly with Table B on self_code to obtain the value of a_tag, as shown in Figure 4. This is clear and simple to do.

Figure 4
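
The article does not show the join that takes Figure 3 to Figure 4. A minimal sketch might look like the following. It assumes the rows of Figure 3 are stored in a temporary table with the hypothetical name T_tmp_3, and that Table B exposes the mapping as the columns self_code and a_tag.

```sql
-- Assumption: T_tmp_3 is a hypothetical temp table holding the Figure 3 rows
-- (id, name, self_code); TABLE_B is assumed to have columns (self_code, a_tag).
SELECT  t.id
        ,t.name
        ,t.self_code
        ,b.a_tag
FROM    T_tmp_3 t
LEFT JOIN TABLE_B b          -- LEFT JOIN keeps codes without a mapping (a_tag is NULL)
ON      t.self_code = b.self_code;
```

Whether to use LEFT JOIN or an inner JOIN depends on whether unmapped codes should still appear in the result.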

Two Methods to Convert Columns to Rows

The key question is how to convert Table A into the table shown in Figure 3. This operation is essentially a column-to-row transposition: we expand one column (usually an array) or several columns of a row, and use one or more other columns as the expansion key, so that one row of data becomes multiple rows. In converting Table A from Figure 1 to Figure 3, we use id and name as keys and expand the self_code_list column into multiple rows.

In ODPS, two built-in functions can help us easily transpose columns to rows:

TRANS_ARRAY

https://www.alibabacloud.com/help/en/maxcompute/user-guide/trans-array

LATERAL VIEW EXPLODE(column)

https://www.alibabacloud.com/help/en/maxcompute/user-guide/lateral-view

Use TRANS_ARRAY

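-- The inner query converts the JSON column into a comma-separated STRING;
-- TRANS_ARRAY then splits it on ',' with id and name as the keys.
SELECT  TRANS_ARRAY(2,',',id,name,self_code_list) AS (id,name,self_code)
  FROM  (
            SELECT  id,name
                   ,ARRAY_JOIN(FROM_JSON(JSON_FORMAT(self_code_list),'array<string>'),',')
                    AS self_code_list
              FROM    TABLE_A
  ) t;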

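In Table A, the self_code_list field is of type JSON, while TRANS_ARRAY requires the columns to be transposed to be of type STRING, so self_code_list must be converted to STRING first. Here, JSON_FORMAT turns the JSON value into a JSON string, FROM_JSON parses that string into an array<string>, and ARRAY_JOIN joins the array elements with commas.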

We will use an example to explain the parameters of this function:

trans_array (<num_keys>, <separator>, <key1>,<key2>,…,<col1>,<col2>,<col3>) as (<key1>,<key2>,...,<col1>, <col2>)

The first parameter is the number of columns that are used as transposed keys when being converted to multiple rows. In this example, we use id and name as keys, so the value of this parameter is 2.

The second parameter is the separator used to split a string into multiple elements, that is, the delimiter that turns one row into multiple elements. Set it to whatever separator the data actually uses, usually a comma (,) or semicolon (;).

The remaining parameters are column names of STRING type. The function treats the first N columns as keys during the transpose and the last M columns as the arrays to be transposed to rows. In this example, the column name parameters are id, name, and self_code_list. With num_keys = 2, the id and name columns are used as keys in the result set, while self_code_list is the column to be transposed.
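
To illustrate the case where more than one column is transposed, here is a sketch with one key and two transposed columns. The table t_login and its columns are hypothetical and not part of this article's data. Both list columns are assumed to be comma-separated STRING values whose elements correspond by position.

```sql
-- Hypothetical table: t_login(login_id STRING, login_ip STRING, login_time STRING)
-- Example row:
--   'userA', '192.168.0.1,192.168.0.2', '2016-01-01 01:00:00,2016-01-01 02:00:00'
SELECT  TRANS_ARRAY(1, ',', login_id, login_ip, login_time)
        AS (login_id, login_ip, login_time)
FROM    t_login;
-- num_keys = 1: login_id is the key; login_ip and login_time are each split
-- on ',' and expanded together, so the example row becomes two rows,
-- one per element position.
```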

Use LATERAL VIEW EXPLODE

SELECT  id
        ,name
        ,self_code
FROM    TABLE_A
        LATERAL VIEW EXPLODE(FROM_JSON(JSON_FORMAT(self_code_list),"array<string>")) tmp AS self_code;

Note that the input parameter of the EXPLODE function must be ARRAY.
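
If the column holds a delimited string rather than JSON, one way to obtain the ARRAY that EXPLODE expects is the SPLIT function. A sketch, assuming a hypothetical table TABLE_A_STR whose self_code_str column is a comma-separated STRING:

```sql
-- Assumption: TABLE_A_STR(id, name, self_code_str) is a hypothetical table;
-- self_code_str looks like 'code1,code2,code3'.
SELECT  id
        ,name
        ,self_code
FROM    TABLE_A_STR
        -- SPLIT returns an array of strings, which EXPLODE turns into rows
        LATERAL VIEW EXPLODE(SPLIT(self_code_str, ',')) tmp AS self_code;
```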

Both methods can transpose columns to rows, but there are slight differences in handling empty columns.

Look at these original data:

SELECT id, name, self_code_list from TABLE_A
where id IN (291, 112, 116, 252); 

Figure 5

Convert these four rows with each of the two methods.

Use TRANS_ARRAY

Figure 6

Use LATERAL VIEW EXPLODE

Figure 7

As the results show, LATERAL VIEW EXPLODE does not retain rows whose list is empty, while TRANS_ARRAY does.
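
If the empty rows should be kept with the LATERAL VIEW approach as well, the OUTER keyword is the usual way to ask for that. The sketch below assumes that rows for which EXPLODE produces no output (for example, an empty array) should come back with self_code set to NULL. It has not been run against the data in Figure 5, and the placeholder value may differ from what TRANS_ARRAY returns for the same rows.

```sql
SELECT  id
        ,name
        ,self_code
FROM    TABLE_A
        -- OUTER keeps the input row even when EXPLODE emits no rows;
        -- self_code is NULL for such rows.
        LATERAL VIEW OUTER EXPLODE(FROM_JSON(JSON_FORMAT(self_code_list), 'array<string>')) tmp AS self_code
WHERE   id IN (291, 112, 116, 252);
```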

Row to Column: the Reverse Operation of Column to Row

Now that we have covered transposing columns to rows, let's look at transposing rows to columns. Recall our initial goal: to map the self_code_list of Table A to a_tag_list, as shown in Figure 8. After transposing the column to rows, we can easily join with Table B and obtain the temporary table shown in Figure 4.

Figure 8

The operation from Figure 4 to Figure 8 transposes rows to columns, that is, it converts multiple rows of data into one or several columns. This is not an arbitrary operation: as with transposing columns to rows, the conversion is driven by a key. The two operations are the reverse of each other. When transposing columns to rows, we split the value of a column in each row into multiple values and emit one row per value for the same key. When transposing rows to columns, we splice the values of a column across multiple rows into a single value and emit one row per key.

In the process from Figure 4 to Figure 8 in this example, we use id and name as keys to splice the a_tag column with commas (,), so that id, name, and a_tag_list form a unique row. The result can also be converted into multiple columns: we only need to choose a separator that distinguishes the columns when splicing, and then apply SPLIT to the spliced value. The function that does the splicing is WM_CONCAT.

https://www.alibabacloud.com/help/en/maxcompute/user-guide/wm-concat

In the above example, we use the WM_CONCAT function like this:

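-- T_tmp_4 is the temporary table shown in Figure 4.
-- WM_CONCAT is an aggregate function, so the key columns go in GROUP BY.
SELECT  id
        ,name
        ,WM_CONCAT(',',a_tag) AS a_tag_list
FROM    T_tmp_4
GROUP BY id, name;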

Therefore, we can obtain the result set shown in Figure 8.
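
Putting the steps together, the whole conversion from Table A to Figure 8 could be sketched as a single query, without materializing the intermediate tables. This is an illustration rather than the author's exact query: it assumes Table B has the columns self_code and a_tag, and it uses the LATERAL VIEW EXPLODE variant, so rows with an empty list are dropped.

```sql
SELECT  t.id
        ,t.name
        ,WM_CONCAT(',', b.a_tag) AS a_tag_list   -- rows back into one value per key
FROM    (
            -- Column to row: one output row per element of self_code_list
            SELECT  id
                    ,name
                    ,self_code
            FROM    TABLE_A
                    LATERAL VIEW EXPLODE(FROM_JSON(JSON_FORMAT(self_code_list), 'array<string>')) tmp AS self_code
        ) t
JOIN    TABLE_B b                                -- assumed columns: self_code, a_tag
ON      t.self_code = b.self_code
GROUP BY t.id, t.name;
```

Note that the order of the values inside a_tag_list is not necessarily the order of the original self_code_list, so do not rely on position unless you order the values explicitly.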

With the preceding operations, we have transposed the table's columns to rows and its rows back to columns, and achieved our goal. For more information about transposing rows to columns and columns to rows, please refer to the official MaxCompute documentation.


Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.
