All Products
Search
Document Center

MaxCompute:ORC foreign tables

Last Updated:Jan 01, 2026

This topic describes how to create, read from, and write to OSS foreign tables in ORC format.

Scope

  • OSS external tables do not support the cluster property.

  • A single file cannot exceed 2 GB. If a file is too large, split it.

  • MaxCompute and OSS must be in the same region.

Supported data types

For more information about MaxCompute data types, see Data Types Version 1.0 and Data Types Version 2.0.

  • Java Native Interface (JNI) mode (the Native ORC Reader is not used to read tables): set odps.ext.oss.orc.native=false;. Both read and write operations are supported.

  • Native mode (the Native ORC Reader is used to read tables): set odps.ext.oss.orc.native=true;. Only read operations are supported.

Pattern

Java mode (read/write)

Native mode (read-only)

TINYINT

已开通

已开通

SMALLINT

已开通

已开通

INT

已开通

已开通

BIGINT

已开通

已开通

BINARY

已开通

已开通

FLOAT

已开通

已开通

DOUBLE

已开通

已开通

DECIMAL(precision,scale)

已开通

已开通

VARCHAR(n)

已开通

已开通

CHAR(n)

已开通

已开通

STRING

已开通

已开通

DATE

已开通

已开通

DATETIME

未开通

已开通

TIMESTAMP

未开通

未开通

TIMESTAMP_NTZ

已开通

未开通

BOOLEAN

已开通

已开通

ARRAY

已开通

已开通

MAP

已开通

已开通

STRUCT

已开通

已开通

JSON

未开通

未开通

Supported compression formats

  • To read from or write to compressed OSS files, you must add the with serdeproperties property to the CREATE TABLE statement. For more information, see Parameters for WITH SERDEPROPERTIES.

  • You can read from and write to ORC files that are compressed using SNAPPY or ZLIB.

Create a foreign table

Syntax

If the schema in the ORC file is inconsistent with the foreign table schema:

  • Inconsistent number of columns: If an ORC file has fewer columns than the foreign table defined in the Data Definition Language (DDL) statement, MaxCompute fills the missing columns with NULL values when reading the data. If the ORC file has more columns, the extra columns are discarded.

  • Inconsistent column types: MaxCompute lets you read INT data from an ORC file as the STRING type, but this practice is not recommended. When you attempt to read STRING data as the INT type, string values are converted to NULL, whereas numeric values are processed correctly.

Basic syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
(
  <col_name> <data_type>,
  ...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
STORED AS orc
LOCATION '<oss_location>';

Full syntax

CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
(
  <col_name> <data_type>,
  ...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH serdeproperties(
    'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
)
STORED AS orc 
LOCATION '<oss_location>' 
tblproperties (
    '<xxx>'='<yyy>'
);

Common parameters

For information about common parameters, see Basic syntax parameters.

Unique parameters

Parameters for WITH SERDEPROPERTIES

property_name

Scenario

Description

property_value

Default value

mcfed.orc.schema.resolution

Add this property when the data in the same OSS foreign table has different schemas.

Specifies the parsing method for ORC files. The value name indicates that files are parsed by column name.

name

By column position.

By column position. This is equivalent to setting 'mcfed.orc.schema.resolution'='position'.

Parameters for TBLPROPERTIES

property_name

Scenario

Description

property_value

Default value

mcfed.orc.compress

Add this property to write ORC data to OSS in a compressed format.

ORC compression property. Specifies the compression method for ORC data.

  • SNAPPY

  • ZLIB

None

io.compression.codecs

Add this property if the OSS data file is in Raw-Snappy format.

If you set this parameter to `True`, MaxCompute can read compressed data. Otherwise, the read operation fails.

com.aliyun.odps.io.compress.SnappyRawCodec

None

odps.external.data.output.prefix

(Compatible with odps.external.data.prefix)

Add this property to specify a custom prefix for the names of output files.

  • The prefix can contain only digits, letters, and underscores (_).

  • The length must be from 1 to 10 characters.

A valid string, such as 'mc_'.

None

odps.external.data.enable.extension

Add this property to display the extension of output files.

If this property is set to True, the file extension is displayed. Otherwise, the file extension is not displayed.

  • True

  • False

False

odps.external.data.output.suffix

Add this property to specify a custom suffix for the names of output files.

The suffix can contain only digits, letters, and underscores (_).

A valid string, such as '_hangzhou'.

None

odps.external.data.output.explicit.extension

Add this property to specify a custom extension for output files.

  • The extension can contain only digits, letters, and underscores (_).

  • The length must be from 1 to 10 characters.

  • This property has a higher priority than the odps.external.data.enable.extension property.

A valid string, such as "jsonl".

None

mcfed.orc.batch.size

Controls the number of records processed at a time. This affects memory usage and processing efficiency.

ORC tuning property. Defines the default batch size for ORC in rows.

A non-negative integer

1000

Write data

For more information about the write syntax in MaxCompute, see Write syntax.

Query and analysis

  • For more information about the SELECT syntax, see Query syntax.

  • For more information about optimizing query plans, see Query optimization.

  • ORC foreign tables support query optimization by enabling predicate pushdown (PPD). To enable PPD, add the following parameters before your SQL statement:

    The PPD parameter is used only in Native mode, which requires the Native switch to be set to true.

    -- Enable the ORC native reader.
    SET odps.ext.oss.orc.native=true;
    
    -- Enable ORC PPD.
    SET odps.storage.orc.use.predicate.pushdown=true; 

Scenarios

This section describes how to create an ORC foreign table that uses SNAPPY compression, and then read data from and write data to the table.

  1. Prerequisites

    1. You have created a MaxCompute project.

    2. Prepare an OSS bucket and folder. For more information, see Create a bucket and Manage folders.

      MaxCompute is deployed only in some regions. To prevent data connectivity issues, use an OSS bucket in the same region as your MaxCompute project.
    3. Authorization

      1. You must have permissions to access OSS. You can use an Alibaba Cloud account, a Resource Access Management (RAM) user, or a RAM role to access OSS foreign tables. For more information about authorization, see STS authorization for OSS.

      2. You must have the CreateTable permission in the MaxCompute project. For more information about table operation permissions, see MaxCompute permissions.

  2. Prepare a data file in SNAPPY format.

    In the oss-mc-test bucket for the Sample data, create the orc_snappy/dt=20250526 folder hierarchy, and upload the snappy file to the dt=20250526 partition folder.

  3. Create an ORC foreign table with SNAPPY compression.

    CREATE EXTERNAL TABLE orc_data_type_snappy
    (
        vehicleId INT,
        recordId INT,
        patientId INT,
        calls INT,
        locationLatitute DOUBLE,
        locationLongitude DOUBLE,
        recordTime STRING,
        direction STRING
    )
    PARTITIONED BY (dt STRING )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    WITH serdeproperties (
     'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
    ) 
    STORED AS ORC  
    LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/oss-mc-test/orc_snappy/'
    tblproperties (
        'mcfed.orc.compress'='SNAPPY');
  4. Import partition data. If an OSS foreign table is a partitioned table, you must perform an additional step to import the partition data. For more information, see Syntax for adding partition data to an OSS foreign table.

    -- Import partition data
    MSCK REPAIR TABLE orc_data_type_snappy ADD PARTITIONS;
  5. Read data from the ORC foreign table.

    SELECT * FROM orc_data_type_snappy WHERE dt=20250526 LIMIT 10;

    The following result is returned:

    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | vehicleid  | recordid   | patientid  | calls      | locationlatitute | locationlongitude | recordtime     | direction  | dt         |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | 1          | 12         | 76         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:10 | SW         | 20250526   |
    | 1          | 1          | 51         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:00 | S          | 20250526   |
    | 1          | 2          | 13         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:01 | NE         | 20250526   |
    | 1          | 3          | 48         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:02 | NE         | 20250526   |
    | 1          | 4          | 30         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:03 | W          | 20250526   |
    | 1          | 5          | 47         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:04 | S          | 20250526   |
    | 1          | 6          | 9          | 1          | 46.81006         | -92.08174         | 9/14/2014 0:05 | S          | 20250526   |
    | 1          | 7          | 53         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:06 | N          | 20250526   |
    | 1          | 8          | 63         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:07 | SW         | 20250526   |
    | 1          | 9          | 4          | 1          | 46.81006         | -92.08174         | 9/14/2014 0:08 | NE         | 20250526   |
    | 1          | 10         | 31         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:09 | N          | 20250526   |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
  6. Write data to the ORC foreign table and query the data.

    INSERT INTO orc_data_type_snappy PARTITION (dt ='20250526') 
      VALUES (1,16,76,1,46.81006,-92.08174,'9/14/2014 0:10','SW');
    
    -- Query the newly written data
    SELECT * FROM orc_data_type_snappy WHERE dt = '20250526' AND recordid=16;

    The following result is returned:

    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | vehicleid  | recordid   | patientid  | calls      | locationlatitute | locationlongitude | recordtime     | direction  | dt         |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | 1          | 16         | 76         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:10 | SW         | 20250526   |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+