MaxCompute: MaxFrame-specific APIs

Last Updated: Oct 16, 2025

This topic describes the APIs that are specific to MaxFrame, including Session, Input/Output, Execute, and Fetch. These APIs provide a convenient way to process data in MaxFrame tasks.

Session

new_session

  • API name: new_session. For more information about the source code, see new_session.

    new_session(
      session_id: str = None,
      default: bool = True,
      new: bool = True,
      odps_entry: Optional[ODPS] = None
    )
  • Description: starts a MaxFrame task session.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | session_id | String | No | The session identifier. This parameter specifies a unique identifier for a new session. If this parameter is not specified, MaxFrame automatically generates a default identifier. |
    | default | Boolean | No | Specifies whether to use the created session as the default session. Default value: True. |
    | new | Boolean | No | Specifies whether to create a session. Default value: True. If this parameter is set to False, an existing session is reused based on session_id. |
    | odps_entry | ODPS | Yes | The MaxCompute entry object. For more information, see Create a MaxCompute entry point. |

  • Return value

    The session object.

  • Sample code

    import os
    
    from maxframe import new_session
    from odps import ODPS
    
    # Use the MaxFrame account to initialize MaxCompute.
    o = ODPS(
        # Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to the AccessKey ID of your Alibaba Cloud account. 
        # Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to the AccessKey secret of your Alibaba Cloud account. 
        # We recommend that you do not directly use the actual AccessKey ID and AccessKey secret. 
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
        project='your-default-project',
        endpoint='your-end-point',
    )
    
    # Initialize the MaxFrame session.
    session = new_session(odps_entry=o)
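
The sample above reads the AccessKey pair with os.getenv, which returns None without any warning when a variable is not set. The following is a minimal sketch of a stricter lookup that fails early instead. The helper name load_access_key is hypothetical and is not part of MaxFrame or PyODPS; only the two environment variable names come from the sample.

```python
import os
from typing import Tuple


def load_access_key(
    id_var: str = "ALIBABA_CLOUD_ACCESS_KEY_ID",
    secret_var: str = "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
) -> Tuple[str, str]:
    """Return (AccessKey ID, AccessKey secret) read from the environment.

    Hypothetical helper: raises RuntimeError for an unset or empty
    variable instead of returning None as os.getenv would.
    """
    missing = [name for name in (id_var, secret_var) if not os.environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return os.environ[id_var], os.environ[secret_var]
```

The returned pair could be passed as the first two positional arguments of ODPS(...) in place of the two os.getenv calls.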

Input/Output-related APIs

read_odps_table

  • API name: read_odps_table. For more information about the source code, see read_odps_table.

    read_odps_table(
      table_name: Union[str, Table],
      partitions: Union[None, str, List[str]] = None,
      columns: Optional[List[str]] = None,
      index_col: Union[None, str, List[str]] = None,
      odps_entry: ODPS = None,
      string_as_binary: bool = None,
      append_partitions: bool = False
    )
  • Description: reads data from a MaxCompute table and builds a DataFrame object. You can specify one or more columns as indexes. If you do not specify indexes, a RangeIndex is generated.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | table_name | String/Table | Yes | The name of the MaxCompute table or the table object from which you want to read data. |
    | partitions | String/List | No | The table partition or partition list from which you want to read data. The format is <partition_name>=<partition_value>. If you do not specify this parameter, data from all partitions in the table is read. |
    | columns | List | No | The names of columns from which you want to read data. The format is <column1>, <column2>, .... If you do not specify this parameter, data from all columns except partition key columns is read. |
    | index_col | String/List | No | The names of columns that are used as indexes. |
    | odps_entry | ODPS | No | The ODPS entry object. For more information, see Initialize an ODPS entry point. |
    | string_as_binary | Boolean | No | Specifies whether to read string data in the binary form. |
    | append_partitions | Boolean | No | Specifies whether to read data from partition key columns. Default value: False. If this parameter is set to True and the columns parameter is not specified, data from all columns, including partition key columns, is read. |

  • Return value

    The DataFrame object.

  • Sample code

    import maxframe.dataframe as md
    
    df = md.read_odps_table('BIGDATA_PUBLIC_DATASET.data_science.maxframe_ml_100k_users', index_col='user_id', columns=['age', 'sex'])
    print(df.execute().fetch())
    
    # Return value
    # user_id  age  sex
    # 1        24   M
    # 2        53   F
    # 3        23   M
    # 4        24   M
    # 5        33   F
    # ...      ...  ...
    # 939      26   F
    # 940      32   M
    # 941      20   M
    # 942      48   F
    # 943      22   M
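
The interaction between columns and append_partitions can be modeled in a few lines of plain Python. This is only a sketch of the rule stated in the parameter descriptions, not MaxFrame's implementation. The function name resolve_columns and its first two arguments are hypothetical, and the behavior when both columns and append_partitions=True are given is an assumption, because the descriptions do not cover that case.

```python
from typing import List, Optional, Sequence


def resolve_columns(
    data_columns: Sequence[str],
    partition_columns: Sequence[str],
    columns: Optional[Sequence[str]] = None,
    append_partitions: bool = False,
) -> List[str]:
    """Model which columns are read, following the parameter descriptions."""
    if columns is not None:
        # Assumption: an explicit column list is used exactly as given.
        return list(columns)
    # No explicit list: all non-partition columns are read ...
    selected = list(data_columns)
    if append_partitions:
        # ... plus the partition key columns when append_partitions is True.
        selected += list(partition_columns)
    return selected
```

For a table with data columns age and sex and a partition key column pt, the model returns only age and sex by default, and all three columns when append_partitions=True.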

read_odps_query

  • API name: read_odps_query. For more information about the source code, see read_odps_query.

    read_odps_query(
      query: str,
      odps_entry: ODPS = None,
      index_col: Union[None, str, List[str]] = None,
      string_as_binary: bool = None
    )
  • Description: reads the result of a MaxCompute SQL query and creates a DataFrame object. You can specify one or more columns as indexes. If you do not specify indexes, a RangeIndex is generated.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | query | String | Yes | The MaxCompute SQL query whose result you want to read. |
    | odps_entry | ODPS | No | The MaxCompute entry object. For more information, see Create a MaxCompute entry point. |
    | index_col | String/List | No | The names of columns that are used as indexes. |
    | string_as_binary | Boolean | No | Specifies whether to read string data in the binary form. |

  • Return value

    The DataFrame object.

  • Sample code

    import maxframe.dataframe as md
    
    df = md.read_odps_query('select user_id, age, sex FROM `BIGDATA_PUBLIC_DATASET.data_science.maxframe_ml_100k_users`')
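
To make the effect of index_col concrete, the following toy model pairs each row with its index label. It is a plain-Python sketch and not how MaxFrame builds a DataFrame. The function name apply_index is hypothetical, and the two sample rows are taken from the read_odps_table output shown earlier.

```python
from typing import Any, Dict, List, Sequence, Tuple, Union


def apply_index(
    rows: Sequence[Dict[str, Any]],
    index_col: Union[None, str, List[str]] = None,
) -> List[Tuple[Any, Dict[str, Any]]]:
    """Pair each row with its index label (toy model of index_col)."""
    if index_col is None:
        # No index columns: labels are 0, 1, 2, ... like a RangeIndex.
        return [(i, dict(row)) for i, row in enumerate(rows)]
    keys = [index_col] if isinstance(index_col, str) else list(index_col)
    indexed = []
    for row in rows:
        # One index column gives a scalar label, several give a tuple.
        label = row[keys[0]] if len(keys) == 1 else tuple(row[k] for k in keys)
        rest = {k: v for k, v in row.items() if k not in keys}
        indexed.append((label, rest))
    return indexed


rows = [
    {"user_id": 1, "age": 24, "sex": "M"},
    {"user_id": 2, "age": 53, "sex": "F"},
]
by_user = apply_index(rows, index_col="user_id")
by_position = apply_index(rows)
```

In this model, by_user is labeled by user_id and no longer carries it as a data column, while by_position keeps all three columns and is labeled 0 and 1.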

to_odps_table

  • API name: to_odps_table. For more information about the source code, see to_odps_table.

    to_odps_table(
      table: Union[Table, str],
      partition: Optional[str] = None,
      partition_col: Union[None, str, List[str]] = None,
      overwrite: bool = False,
      unknown_as_string: Optional[bool] = None,
      index: bool = True,
      index_label: Union[None, str, List[str]] = None,
      lifecycle: Optional[int] = None
    )
  • Description: writes a DataFrame object to a MaxCompute table. If the table does not exist in MaxCompute, MaxFrame automatically creates the table.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | table | String/Table | Yes | The name of the table or the table object to which you want to write DataFrame data. |
    | partition | String | No | The partition to which you want to write data. For example, pt1=xxx, pt2=yyy. |
    | partition_col | String/List | No | The names of the DataFrame columns that are used as partition key columns. |
    | overwrite | Boolean | No | Specifies whether to overwrite data if the table or partition already exists. Default value: False. |
    | unknown_as_string | Boolean | No | Specifies whether to process data of an unrecognized type as the STRING data type. Default value: False. If this parameter is set to True, the object type in DataFrame is processed as the STRING data type. An error may occur. |
    | index | Boolean | No | Specifies whether to store indexes. Default value: True. |
    | index_label | String/List | No | The name of the index column or columns. If you do not specify this parameter, a single-level index is stored in a column named `index`, and a multi-level index is stored in columns named `level_x`, where x is the level of the index. |
    | lifecycle | int | No | The lifecycle of the output table. The value of this parameter is a positive integer. If the table already exists, the setting of this parameter overwrites the original lifecycle setting. |

  • Return value

    The DataFrame object.

  • Sample code

    import maxframe.dataframe as md
    
    df = md.read_odps_query('select user_id, age, sex FROM `BIGDATA_PUBLIC_DATASET.data_science.maxframe_ml_100k_users`', index_col='user_id')
    output_df = df.to_odps_table('output_table', lifecycle=7)
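
The partition parameter takes a string such as pt1=xxx, pt2=yyy. When the partition values are computed at run time, a small helper can assemble that string. The sketch below is illustrative only: build_partition_spec is a hypothetical name, and joining the levels with a bare comma is an assumption based on the example in the parameter description.

```python
from typing import Mapping


def build_partition_spec(levels: Mapping[str, object]) -> str:
    """Join partition levels into a 'name=value,name=value' string."""
    if not levels:
        raise ValueError("at least one partition level is required")
    # Mapping order is preserved, so levels appear in insertion order.
    return ",".join(f"{name}={value}" for name, value in levels.items())


spec = build_partition_spec({"pt1": "xxx", "pt2": "yyy"})
```

The resulting string could then be passed as partition=spec in the to_odps_table call.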

to_odps_model

  • API name: to_odps_model.

    to_odps_model(
        model_name: str,
        model_version: str = None,
        schema: str = None,
        project: str = None,
        description: Optional[str] = None,
        version_description: Optional[str] = None,
        create_model: bool = True,
        set_default_version: bool = False
    )
  • Description: saves an XGBoost model that is trained by a MaxFrame job as a MaxCompute model object.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | model_name | String | Yes | The model name. If project and schema are specified separately in the job, specify only the model name. Otherwise, specify the model name in the project.schema.model format. |
    | model_version | String | No | The model version. If you do not specify this parameter, the system automatically generates a version. |
    | schema | String | No | The schema to which the model belongs. If you do not specify this parameter, the default schema is "default". |
    | project | String | No | The project to which the model belongs. |
    | description | String | No | The model description. |
    | version_description | String | No | The description of the model version. |
    | create_model | Boolean | No | Specifies whether to automatically create the model if it does not exist. Default value: True. |
    | set_default_version | Boolean | No | Specifies whether to set the current version as the default version of the model. Default value: False. |

  • Return value

    A Scalar object. You can call .execute() to trigger the model saving operation.

  • Sample code

    import maxframe.dataframe as md
    from maxframe.learn.contrib.xgboost import XGBClassifier
    
    # Train an XGBoost model.
    # X (feature data), y (labels), and cols (feature column names) must be defined beforehand.
    X_df = md.DataFrame(X, columns=cols)
    clf = XGBClassifier(n_estimators=10)
    clf.fit(X_df, y)
    
    # Save the model to MaxCompute.
    clf.to_odps_model(
        model_name="my_model",
        # If you specify a project and schema, the format of model_name is as follows:
        # model_name="project.schema.my_model"
        model_version="version1"
    ).execute()
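
The two ways of naming a model (a bare name plus separate project and schema, or a single project.schema.model string) can be related with a small helper. This is a sketch under stated assumptions: resolve_model_name is a hypothetical function, not a MaxFrame API, and it assumes that the two forms identify the same model. The fallback to the "default" schema follows the schema parameter description.

```python
from typing import Optional


def resolve_model_name(
    model_name: str,
    project: Optional[str] = None,
    schema: Optional[str] = None,
) -> str:
    """Return a three-part name of the form project.schema.model."""
    parts = model_name.split(".")
    if len(parts) == 3:
        # Already fully qualified: project and schema are embedded in the name.
        return model_name
    if len(parts) != 1:
        raise ValueError("expected 'model' or 'project.schema.model'")
    if project is None:
        raise ValueError("project is required for a bare model name")
    # An unspecified schema falls back to "default".
    return f"{project}.{schema or 'default'}.{model_name}"
```

For example, a bare my_model with project="my_project" resolves to my_project.default.my_model in this model.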

Execute

execute

  • API name: execute. For more information about the source code, see execute.

    execute(
      session: SessionType = None
    )
  • Description: calls the execute method to start a data processing task.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | session | Session | No | The session that is used to run a data processing task. For more information about how to create a session, see new_session. If this parameter is not specified, the global session initialized using new_session is used. |

  • Return value

    The object on which execute is called. This allows chained calls such as df.execute().fetch().

  • Sample code

    import maxframe.dataframe as md
    
    df = md.read_odps_query('select user_id, age, sex FROM `BIGDATA_PUBLIC_DATASET.data_science.maxframe_ml_100k_users`', index_col='user_id')
    df.execute()

Fetch

fetch

  • API name: fetch. For more information about the source code, see fetch.

    fetch(
      session: SessionType = None
    )
  • Description: returns the result data of an executed task to the local environment.

  • Input parameters

    | Parameter | Data type | Required | Description |
    | --- | --- | --- | --- |
    | session | Session | No | The session that is used to obtain the result data. For more information about how to create a session, see new_session. If this parameter is not specified, the global session initialized using new_session is used. |

  • Return value

    A pandas DataFrame or Series.

  • Sample code

    import maxframe.dataframe as md
    
    df = md.read_odps_query('select user_id, age, sex FROM `BIGDATA_PUBLIC_DATASET.data_science.maxframe_ml_100k_users`', index_col='user_id')
    res = df.execute().fetch()
    print(res)
    
    # Returned result
    # user_id   age  sex
    # 1         24   M
    # 2         53   F
    # 3         23   M
    # 4         24   M
    # 5         33   F
    # ...      ...  ..
    # 939       26   F
    # 940       32   M
    # 941       20   M
    # 942       48   F
    # 943       22   M
    #
    # [943 rows x 2 columns]
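
The division of labor between execute and fetch, and the fallback to the default session, can be illustrated with a toy model. Everything in the sketch below (ToySession, ToyFrame, and the module-level _default_session) is hypothetical and written only for illustration. It does not reflect how MaxFrame is implemented. It only mirrors the behavior described above: execute runs the task in the given or default session, and fetch brings the result to the caller.

```python
from typing import Callable, List, Optional

# Stands in for the global session that new_session(default=True) registers.
_default_session: Optional["ToySession"] = None


class ToySession:
    """Hypothetical session that only records how many tasks it ran."""

    def __init__(self, name: str, default: bool = True):
        global _default_session
        self.name = name
        self.tasks_run = 0
        if default:
            _default_session = self


class ToyFrame:
    """Hypothetical lazy object: nothing is computed until execute()."""

    def __init__(self, compute: Callable[[], list]):
        self._compute = compute
        self._result: Optional[list] = None

    def execute(self, session: Optional[ToySession] = None) -> "ToyFrame":
        if session is None:
            session = _default_session  # fall back to the global session
        if session is None:
            raise RuntimeError("no session available; create one first")
        self._result = self._compute()
        session.tasks_run += 1
        return self  # returning self allows execute().fetch() chaining

    def fetch(self, session: Optional[ToySession] = None) -> list:
        # The toy keeps results on the frame, so session is unused here.
        if self._result is None:
            raise RuntimeError("call execute() before fetch()")
        return list(self._result)


session = ToySession("demo")
frame = ToyFrame(lambda: [24, 53, 23])
result = frame.execute().fetch()
```

In this model, calling fetch before execute raises an error, which is why the samples above always chain the two calls in that order.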