You can use the MTable Expander component to expand a column that stores data in the MTABLE format into a regular table.
Limits
This component supports the following compute engines: MaxCompute, Flink, and Deep Learning Containers (DLC).
Configure the component in the Platform for AI (PAI) console
Input ports
Input port (left-to-right)
Data type
Recommended upstream component
Required
data
None
Yes
Component parameters
| Tab | Parameter | Description |
| --- | --- | --- |
| Field Setting | selectedCol | The name of the column to expand. The column must be of the STRING type and contain values in the MTABLE format. |
| Field Setting | reservedCols | The columns to retain unchanged in the output. |
| Parameters Setting | Schema | The names and types of the expanded columns, in the format colname coltype[, colname2 coltype2[, ...]]. Example: f0 string, f1 bigint, f2 double. |
| Parameters Setting | handleInvalidMethod | The method used to handle invalid values. Valid values: ERROR (default): an error is returned. SKIP: invalid values are skipped. |
| Execution Tuning | Number of Workers | The number of workers. This parameter must be used together with the Memory per worker, unit MB parameter. The value must be a positive integer in the range [1, 9999]. |
| Execution Tuning | Memory per worker, unit MB | The memory size of each worker. Valid values: 1024 to 64 × 1024. Unit: MB. |
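The Schema format and the two handleInvalidMethod options can be illustrated with a minimal plain-Python sketch. It does not use Alink. The helper names `parse_schema` and `expand_rows`, the type-to-cast mapping, and the choice to drop a whole row under SKIP are assumptions made for illustration; the component's actual validation and skipping rules may differ.

```python
# Illustrative sketch only: plain Python, not the Alink implementation.
# parse_schema and expand_rows are hypothetical helpers.

# Assumed mapping from schema type names to Python casts.
CASTS = {"string": str, "bigint": int, "int": int, "double": float}


def parse_schema(schema_str):
    """Parse 'f0 string, f1 bigint' into [('f0', 'string'), ('f1', 'bigint')]."""
    columns = []
    for part in schema_str.split(","):
        tokens = part.split()
        if len(tokens) != 2:
            raise ValueError(f"expected 'colname coltype', got {part!r}")
        name, col_type = tokens
        if col_type.lower() not in CASTS:
            raise ValueError(f"unsupported type in this sketch: {col_type!r}")
        columns.append((name, col_type.lower()))
    return columns


def expand_rows(rows, schema_str, handle_invalid="ERROR"):
    """Cast each raw row to the schema, mimicking ERROR / SKIP handling."""
    schema = parse_schema(schema_str)
    expanded = []
    for row in rows:
        try:
            if len(row) != len(schema):
                raise ValueError(f"row {row!r} does not match the schema")
            expanded.append(
                tuple(CASTS[col_type](value)
                      for (_, col_type), value in zip(schema, row))
            )
        except (TypeError, ValueError):
            if handle_invalid == "SKIP":
                continue  # SKIP (assumed here): drop the invalid row
            raise  # ERROR: surface the failure to the caller
    return expanded


raw = [("a", "1", "2.5"), ("b", "oops", "3.0")]
print(parse_schema("f0 string, f1 bigint, f2 double"))
print(expand_rows(raw, "f0 string, f1 bigint, f2 double", handle_invalid="SKIP"))
```

With handle_invalid="ERROR", the same input raises a ValueError on the second row instead of dropping it.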
Configure the component by coding
You can copy the following code to the code editor of the PyAlink Script component. This allows the PyAlink Script component to function like the MTable Expander component.
import pandas as pd
from pyalink.alink import *

# Sample data: each id appears twice with different f0 values.
df_data = pd.DataFrame([
    ["a1", "11L", 2.2],
    ["a1", "12L", 2.0],
    ["a2", "11L", 2.0],
    ["a2", "12L", 2.0],
    ["a3", "12L", 2.0],
    ["a3", "13L", 2.0],
    ["a4", "13L", 2.0],
    ["a4", "14L", 2.0],
    ["a5", "14L", 2.0],
    ["a5", "15L", 2.0],
    ["a6", "15L", 2.0],
    ["a6", "16L", 2.0]
])

# Convert the DataFrame to an Alink batch source.
# (Named source and to_mtable to avoid shadowing the built-ins input and zip.)
source = BatchOperator.fromDataframe(df_data, schemaStr='id string, f0 string, f1 double')

# Aggregate the f0 and f1 columns of each id group into one MTable column.
to_mtable = GroupByBatchOp()\
    .setGroupByPredicate("id")\
    .setSelectClause("id, mtable_agg(f0, f1) as m_table_col")

# Expand the MTable column back into rows and keep the id column.
flatten = FlattenMTableBatchOp()\
    .setReservedCols(["id"])\
    .setSelectedCol("m_table_col")\
    .setSchemaStr('f0 string, f1 int')

to_mtable.linkFrom(source).link(flatten).print()
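
To make the data flow of the preceding example easier to follow, the sketch below imitates its two steps in plain Python, without Alink. Rows are first grouped by id into a nested per-group table (the role that mtable_agg plays), and each nested table is then expanded back into rows with the reserved id column attached. The helper names `group_to_nested` and `flatten_nested` are hypothetical, and the sketch does not reproduce Alink's MTable serialization, type casting, or output ordering.

```python
# Illustrative sketch only: imitates group-then-flatten without Alink.

rows = [
    ("a1", "11L", 2.2),
    ("a1", "12L", 2.0),
    ("a2", "11L", 2.0),
    ("a2", "12L", 2.0),
]


def group_to_nested(rows):
    """Collect (f0, f1) pairs per id, like one nested table per group."""
    nested = {}
    for row_id, f0, f1 in rows:
        nested.setdefault(row_id, []).append((f0, f1))
    return nested


def flatten_nested(nested):
    """Emit one output row per nested row, keeping id as a reserved column."""
    return [
        (row_id, f0, f1)
        for row_id, table in nested.items()
        for f0, f1 in table
    ]


nested = group_to_nested(rows)
flat = flatten_nested(nested)
print(nested)  # one entry per id, each holding that group's (f0, f1) rows
print(flat)    # one row per (id, f0, f1), as in the original input
```

In this sketch, flattening simply undoes the grouping, so the expanded rows match the input rows.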