Object Storage Service (OSS) allows you to upload the data of tables that use different schemas to an OSS directory. You can create OSS external tables in MaxCompute to read data from and write data to OSS based on field names. This topic describes how to use an OSS external table to read data from and write data to OSS based on field names.

Prerequisites

  • MaxCompute is activated. A MaxCompute project is created. For more information about how to create a MaxCompute project, see Create a MaxCompute project.
  • OSS is activated and a bucket is created. For more information about how to create a bucket, see Create buckets.

Perform data read and write operations based on field names

  1. Upload data in the Optimized Row Columnar (ORC) format to a specified OSS directory.
    In this example, a directory named demo7 is created in OSS and two data files are uploaded to the directory.
    Note The OSS directory and data files in this example are only for reference. The actual directory and files may vary.
    • orc_file_with_schema0
      -- Field information
      `#id` bigint,`#name` stirng
      -- Data
      {`#id`:1,`#name`:"a"}
      {`#id`:2,`#name`:null}
    • orc_file_with_schema1
      -- Field information
      age bigint,`#name` string,`#id` string
      -- Data
      {age:10,`#name`:"c",`#id`:"3"}
      {age:20,`#name`:"d",`#id`:"4"}
  2. Log on to a tool for MaxCompute connection and development.
    The following table describes the tools that you can use to create an OSS external table.
    Creation method Tool
    Execute MaxCompute SQL statements MaxCompute client
    Query editor in the MaxCompute console
    ODPS SQL nodes in the DataWorks console
    SQL scripts in MaxCompute Studio
    Perform operations on the web UI SQL scripts in MaxCompute Studio
    DataWorks console
  3. Create an OSS external table in MaxCompute.
    Execute the following sample statement to create an OSS external table. For more information about the syntax of the CREATE EXTERNAL TABLE statement, see Create an OSS external table.
    CREATE EXTERNAL TABLE mf_oss_diff_schm
    (
        `#name` string, `#id` string, age bigint
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    WITH SERDEPROPERTIES (
        'mcfed.orc.schema.resolution'='name'
    )
    STORED AS ORC
    LOCATION 'oss://oss-cn-beijing-internal.aliyuncs.com/mfoss*******/demo7/';
  4. Query data from the OSS external table.
    set odps.ext.oss.orc.native=true;
    select * from mf_oss_diff_schm;
    The following result is returned:
    +------------+------------+------------+
    | #name      | #id        | age        |
    +------------+------------+------------+
    | a          | 1          | NULL       |
    | NULL       | 2          | NULL       |
    | c          | 3          | 10         |
    | d          | 4          | 20         |
    +------------+------------+------------+
  5. Write data to the OSS external table and query data from the OSS external table.
    insert into table mf_oss_diff_schm values("e","5",30);
    
    -- Query data.
    select * from mf_oss_diff_schm;
    -- The following result is returned:
    +------------+------------+------------+
    | #name      | #id        | age        |
    +------------+------------+------------+
    | a          | 1          | NULL       |
    | NULL       | 2          | NULL       |
    | e          | 5          | 30         |
    | c          | 3          | 10         |
    | d          | 4          | 20         |
    +------------+------------+------------+