
MaxCompute:ORC Foreign Tables

Last Updated:Feb 07, 2026

This topic describes how to create, read, and write ORC-formatted OSS foreign tables.

Applicability

Create Foreign Tables

Syntax

If the schema of an ORC file is inconsistent with the schema of the foreign table:

  • Inconsistent number of columns: If an ORC file has fewer columns than the foreign table, the system fills the missing columns with NULL values when reading the data. If the file has more columns, the system discards the extra columns.

  • Inconsistent column types: MaxCompute supports using the STRING type to read INT data from ORC files, but this is not recommended. When you use the INT type to read STRING data, the system reads numeric string values as integers and converts non-numeric string values to NULL.
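A quick sketch of the column-count rule, using a hypothetical table and placeholder OSS path (the ORC files under the location are assumed to contain only the columns id and name):

```sql
-- Hypothetical example: the files under the location contain (id, name) only.
CREATE EXTERNAL TABLE IF NOT EXISTS orc_mismatch_demo
(
    id        BIGINT,
    name      STRING,
    extra_col STRING  -- not present in the files
)
STORED AS orc
LOCATION 'oss://<endpoint>/<bucket>/orc_mismatch_demo/';

-- extra_col is filled with NULL for every row read from the files.
SELECT id, name, extra_col FROM orc_mismatch_demo;
```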

Simplified Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
(
  <col_name> <data_type>,
  ...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
STORED AS orc
LOCATION '<oss_location>';
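As a concrete illustration of the simplified syntax, the following statement creates a partitioned ORC foreign table; the table name, bucket, and directory are placeholders:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS mc_oss_orc_demo
(
    id   BIGINT,
    name STRING
)
COMMENT 'demo ORC foreign table'
PARTITIONED BY (dt STRING)
STORED AS orc
LOCATION 'oss://<endpoint>/<bucket>/orc_demo/';
```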

Full Syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
(
  <col_name> <data_type>,
  ...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH serdeproperties(
    'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
)
STORED AS orc 
LOCATION '<oss_location>' 
tblproperties (
    '<xxx>'='<yyy>'
);

Common parameters

For more information about common parameters, see Basic syntax parameters.

Unique Parameters

with serdeproperties Properties

mcfed.orc.schema.resolution

  • Scenario: Add this property if files in the same OSS foreign table have inconsistent schemas.
  • Description: Sets the ORC file parsing method. The value name indicates parsing by column name.
  • Property value: name
  • Default value: None. By default, files are parsed by column position, which is equivalent to 'mcfed.orc.schema.resolution'='position'.
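The following sketch shows where this property goes in the DDL when you want name-based column mapping; the table name and OSS path are placeholders:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS orc_by_name_demo
(
    id   BIGINT,
    name STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH serdeproperties (
    'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole',
    'mcfed.orc.schema.resolution'='name'  -- map columns by name, not position
)
STORED AS orc
LOCATION 'oss://<endpoint>/<bucket>/orc_by_name/';
```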

tblproperties Properties

mcfed.orc.compress

  • Scenario: Add this property to write ORC data to OSS in a compressed format.
  • Description: ORC compression property. Specifies the compression method for ORC data.
  • Property values: SNAPPY, ZLIB
  • Default value: None

io.compression.codecs

  • Scenario: Add this property if the OSS data files are in raw Snappy format.
  • Description: Set this property to com.aliyun.odps.io.compress.SnappyRawCodec so that MaxCompute can read the compressed data. Otherwise, MaxCompute cannot read the data.
  • Property value: com.aliyun.odps.io.compress.SnappyRawCodec
  • Default value: None

odps.external.data.output.prefix (compatible with odps.external.data.prefix)

  • Scenario: Add this property to add a custom prefix to output files.
  • Description: The prefix can contain only letters, digits, and underscores (a-z, A-Z, 0-9, _) and must be 1 to 10 characters long.
  • Property value: A valid combination of characters, such as 'mc_'.
  • Default value: None

odps.external.data.enable.extension

  • Scenario: Add this property to display the file name extension of output files.
  • Description: True displays the extension of output files; False hides it.
  • Property values: True, False
  • Default value: False

odps.external.data.output.suffix

  • Scenario: Add this property to add a custom suffix to output files.
  • Description: The suffix can contain only letters, digits, and underscores (a-z, A-Z, 0-9, _).
  • Property value: A valid combination of characters, such as '_hangzhou'.
  • Default value: None

odps.external.data.output.explicit.extension

  • Scenario: Add this property to add a custom extension to output files.
  • Description: The extension can contain only letters, digits, and underscores (a-z, A-Z, 0-9, _) and must be 1 to 10 characters long. This property takes precedence over odps.external.data.enable.extension.
  • Property value: A valid combination of characters, such as 'jsonl'.
  • Default value: None

mcfed.orc.batch.size

  • Scenario: Add this property to control the number of records processed per batch, which affects memory usage and processing efficiency.
  • Description: ORC tuning property. Defines the ORC batch size in rows.
  • Property value: A non-negative integer
  • Default value: 1000
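Several of these properties can be combined in a single tblproperties clause. The sketch below (table name and OSS path are placeholders) writes SNAPPY-compressed output files with a custom prefix and suffix and a visible extension:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS orc_output_demo
(
    id   BIGINT,
    name STRING
)
STORED AS orc
LOCATION 'oss://<endpoint>/<bucket>/orc_output/'
tblproperties (
    'mcfed.orc.compress'='SNAPPY',                 -- compress output with SNAPPY
    'odps.external.data.output.prefix'='mc_',      -- file name prefix
    'odps.external.data.output.suffix'='_hangzhou',-- file name suffix
    'odps.external.data.enable.extension'='true'   -- keep the .orc extension
);
```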

Write data

For more information about the MaxCompute write syntax, see Write syntax.

Query analysis

  • For more information about the SELECT syntax, see Query syntax.

  • For more information about query plan optimization, see Query optimization.

  • You can optimize queries on ORC foreign tables by enabling Predicate Pushdown (PPD). To do this, add the following parameters before your SQL statement:

    The PPD parameters must be used in Native mode. This means the Native switch must be set to true.

    -- Enable the ORC native reader.
    SET odps.ext.oss.orc.native=true;
    
    -- Enable ORC PPD.
    SET odps.storage.orc.use.predicate.pushdown=true; 
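Put together, a typical PPD session first sets both switches and then runs the filtered query; the table name here is a placeholder:

```sql
-- Enable the ORC native reader and predicate pushdown.
SET odps.ext.oss.orc.native=true;
SET odps.storage.orc.use.predicate.pushdown=true;

-- The predicate can now be evaluated inside the ORC reader, skipping
-- stripes whose statistics rule out matching rows.
SELECT * FROM my_orc_table WHERE dt = '20250526' AND recordid = 16;
```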

Scenario example

This example shows how to create a foreign ORC table that uses SNAPPY compression, and then read data from and write data to the table.

  1. Prerequisites

    1. You have created a MaxCompute project.

    2. You have prepared an OSS bucket and OSS directory. For more information, see Create a bucket and Manage folders.

      Because MaxCompute is available only in specific regions, cross-region connectivity issues may occur. We recommend using an OSS bucket in the same region as your MaxCompute project.
    3. Authorization

      1. You have permissions to access OSS. An Alibaba Cloud account (primary account), Resource Access Management (RAM) user, or RAM role can access OSS external tables. For authorization details, see STS-mode authorization for OSS.

      2. You have CreateTable permissions in the MaxCompute project. For details about table operation permissions, see MaxCompute permissions.

  2. Prepare a SNAPPY-formatted data file.

    Using the provided sample data, create the folder path orc_snappy/dt=20250526 in the oss-mc-test bucket. Upload the SNAPPY file to the dt=20250526 partition folder.

  3. Create a foreign ORC table with SNAPPY compression.

    CREATE EXTERNAL TABLE orc_data_type_snappy
    (
        vehicleId INT,
        recordId INT,
        patientId INT,
        calls INT,
        locationLatitute DOUBLE,
        locationLongitude DOUBLE,
        recordTime STRING,
        direction STRING
    )
    PARTITIONED BY (dt STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    WITH serdeproperties (
     'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
    ) 
    STORED AS ORC  
    LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/oss-mc-test/orc_snappy/'
    tblproperties (
        'mcfed.orc.compress'='SNAPPY');
  4. If your OSS foreign table is partitioned, run the following command to add existing partitions. For more information, see Syntax for adding partitions to an OSS external table.

    -- Add partitions to the external table
    MSCK REPAIR TABLE orc_data_type_snappy ADD PARTITIONS;
  5. Read data from the ORC foreign table.

    SELECT * FROM orc_data_type_snappy WHERE dt = '20250526' LIMIT 10;

    The query returns the following result:

    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | vehicleid  | recordid   | patientid  | calls      | locationlatitute | locationlongitude | recordtime     | direction  | dt         |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | 1          | 12         | 76         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:10 | SW         | 20250526   |
    | 1          | 1          | 51         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:00 | S          | 20250526   |
    | 1          | 2          | 13         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:01 | NE         | 20250526   |
    | 1          | 3          | 48         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:02 | NE         | 20250526   |
    | 1          | 4          | 30         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:03 | W          | 20250526   |
    | 1          | 5          | 47         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:04 | S          | 20250526   |
    | 1          | 6          | 9          | 1          | 46.81006         | -92.08174         | 9/14/2014 0:05 | S          | 20250526   |
    | 1          | 7          | 53         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:06 | N          | 20250526   |
    | 1          | 8          | 63         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:07 | SW         | 20250526   |
    | 1          | 9          | 4          | 1          | 46.81006         | -92.08174         | 9/14/2014 0:08 | NE         | 20250526   |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
  6. Write data to the ORC foreign table and query the data.

    INSERT INTO orc_data_type_snappy PARTITION (dt ='20250526') 
      VALUES (1,16,76,1,46.81006,-92.08174,'9/14/2014 0:10','SW');
    
    -- Query the newly inserted data
    SELECT * FROM orc_data_type_snappy WHERE dt = '20250526' AND recordid=16;

    The query returns the following result:

    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | vehicleid  | recordid   | patientid  | calls      | locationlatitute | locationlongitude | recordtime     | direction  | dt         |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | 1          | 16         | 76         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:10 | SW         | 20250526   |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+

Supported data types

For more information about MaxCompute data types, see Data type version 1.0 and Data type version 2.0.

  • JNI mode (set odps.ext.oss.orc.native=false;): This mode does not use the native ORC reader when reading tables. It supports both read and write operations.

  • Native mode (set odps.ext.oss.orc.native=true;): This mode uses the native ORC reader when reading tables. It supports only read operations.

Data type                  | JNI mode (read and write) | Native mode (read-only)
---------------------------+---------------------------+------------------------
TINYINT                    | Supported                 | Supported
SMALLINT                   | Supported                 | Supported
INT                        | Supported                 | Supported
BIGINT                     | Supported                 | Supported
BINARY                     | Supported                 | Supported
FLOAT                      | Supported                 | Supported
DOUBLE                     | Supported                 | Supported
DECIMAL(precision,scale)   | Supported                 | Supported
VARCHAR(n)                 | Supported                 | Supported
CHAR(n)                    | Supported                 | Supported
STRING                     | Supported                 | Supported
DATE                       | Supported                 | Supported
DATETIME                   | Not supported             | Supported
TIMESTAMP                  | Not supported             | Not supported
TIMESTAMP_NTZ              | Supported                 | Not supported
BOOLEAN                    | Supported                 | Supported
ARRAY                      | Supported                 | Supported
MAP                        | Supported                 | Supported
STRUCT                     | Supported                 | Supported
JSON                       | Not supported             | Not supported

Supported Compression Formats

  • To read or write compressed OSS files, you must add the corresponding compression configuration to the table creation statement. For more information, see tblproperties Properties.

  • MaxCompute supports reading and writing ORC files that are compressed with SNAPPY or ZLIB.

Schema Evolution Support

ORC foreign tables support two methods for mapping the table schema to the columns in the data file: position-based mapping and name-based mapping.

  • Position-based mapping: To use this method, set the 'mcfed.orc.schema.resolution'='position' parameter or omit it to use the default setting. The system maps columns based on their position. Therefore, the column order in the table must exactly match the field order in the file.

  • Name-based mapping: To use this method, set the 'mcfed.orc.schema.resolution'='name' parameter when you create the foreign table. The system then maps columns by name instead of by position.

The Data compatibility column in the following table describes whether a foreign table can correctly read data after a schema evolution operation. This includes reading new data that conforms to the modified schema and historical data that still uses the old schema.

Add column

  • Position-based mapping: Supported.
  • Name-based mapping: Supported.
  • Description:
      • New columns cannot be added at a specified position; they are appended as the last column.
      • Adding a regular column with a default value takes effect only for data written from MaxCompute.
  • Data compatibility:
      • Data that conforms to the modified schema can be read correctly.
      • Historical data that still uses the old schema is read with the new schema. For example, if a column is added to the table but not to the historical data, that column is filled with NULL when the historical data is read.

Delete column

  • Position-based mapping: Not supported. Not recommended: ORC foreign tables map column values by position, so the column order of the table must match the field order in the file. If the file and table fields no longer match after a column is deleted, an error occurs when the table is read.
  • Name-based mapping: Supported. The system matches columns by name and no longer relies on order. Compatible.
  • Data compatibility (position-based mapping):
      • Data that conforms to the modified schema can be read correctly.
      • Historical data that still uses the old schema is read with the new schema. For example, if a column is deleted from the table but not from the historical data, an error occurs when the table is read.

Modify column order

  • Position-based mapping: Not supported. Not recommended: ORC foreign tables map column values by position, so the column order of the table must match the field order in the file. If the file and table fields no longer match after the columns are reordered, an error occurs when the table is read.
  • Name-based mapping: Supported. The system matches columns by name and no longer relies on order. Compatible.
  • Data compatibility (position-based mapping):
      • Data that conforms to the modified schema can be read correctly.
      • Historical data that still uses the old schema is read with the new schema. For example, if the column order of the table is modified but the column order of the historical data is not, the schema and the data no longer correspond.

Change column data type

  • Position-based mapping: Supported.
  • Name-based mapping: Supported.
  • Description: For the data type conversion table, see Change column data type.
  • Data compatibility: Compatible.

Modify column name

  • Position-based mapping: Supported. Compatible.
  • Name-based mapping: Not supported. Not recommended: with name-based mapping, the system matches columns by name, and after a rename the original column name may no longer be found in the file.
  • Data compatibility (name-based mapping):
      • Data that conforms to the modified schema can be read correctly.
      • Historical data that still uses the old schema is read with the new schema. For example, if a column is renamed in the table but not in the ORC file schema, that column is empty when the table is read.

Modify column comment

  • Position-based mapping: Supported.
  • Name-based mapping: Supported.
  • Description: The comment must be a valid string of at most 1,024 bytes; otherwise, an error is reported.
  • Data compatibility: Compatible.

Modify column nullability

  • Position-based mapping: Not supported.
  • Name-based mapping: Not supported.
  • Description: This operation is not supported. Columns are Nullable by default.
  • Data compatibility: Not applicable.
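For example, an add-column evolution can be performed directly with DDL. The sketch below assumes a hypothetical foreign table my_orc_table and standard MaxCompute ALTER TABLE syntax:

```sql
-- The new column is appended as the last column of the table.
ALTER TABLE my_orc_table ADD COLUMNS (new_col STRING);

-- Historical files that lack new_col return NULL for that column.
SELECT new_col FROM my_orc_table;
```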