Parameter | Description | Required | Default Value |
datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value |
object | The prefix for the names of the files that you want to write to OSS. OSS uses forward slashes (/) in file names to simulate directories. Examples: If you set the object parameter to datax, the names of the files start with datax and end with random strings. If you set the object parameter to cdo/datax, the names of the files start with cdo/datax and end with random strings.
If you do not want a random universally unique identifier (UUID) appended as the suffix, we recommend that you set the writeSingleObject parameter to true. For more information, see the description of the writeSingleObject parameter. | Yes | No default value |
writeMode | The write mode. Valid values:
truncate: OSS Writer deletes all existing objects whose names start with the specified prefix before it writes files to OSS. For example, if you set the object parameter to abc, OSS Writer deletes all the objects whose names start with abc before it writes files to OSS.
append: OSS Writer writes all files to OSS and suffixes the file names with random UUIDs to ensure that the names of the files are different from the names of existing objects. For example, if you set the object parameter to DI, the actual names of the files written to OSS are in the DI_****_****_**** format.
nonConflict: If OSS contains objects whose names start with the specified prefix, OSS Writer returns an error. For example, if you set the object parameter to abc and OSS contains an object named abc123, OSS Writer returns an error.
| Yes | No default value |
writeSingleObject | Specifies whether to write a single file to OSS at a time. Valid values:
true: writes a single file to OSS at a time. If no data is read, no empty file is generated.
false: writes multiple files to OSS at a time. If no data is read and a file header is configured, an empty file that contains only the file header is generated. If no data is read and no file header is configured, an empty file is generated.
Note: The writeSingleObject parameter does not take effect for ORC or Parquet files. This means that you cannot write a single ORC or Parquet file to OSS if multiple threads are used to synchronize data. To write a single file, you can set the concurrent parameter to 1. In this case, a random suffix is still added to the file name, and data synchronization is slower because only one thread is used. | No | false |
fileFormat | The format in which OSS Writer writes files to OSS. Valid values:
csv: The file must follow CSV specifications. If the data contains column delimiters, the column delimiters are escaped with double quotation marks (").
text: The data is separated by column delimiters. OSS Writer does not escape column delimiters that appear in the data.
parquet: To write Parquet files to OSS, you must configure the parquetSchema parameter to define the data types of the columns.
orc: To write ORC files to OSS, you must use the code editor.
| No | text |
compress | The compression type of the files that you want to write to OSS. This parameter is available only in the code editor. Note: CSV and text files cannot be compressed. Parquet and ORC files can be compressed in a format such as Snappy or GZIP. | No | No default value |
fieldDelimiter | The column delimiter that is used in the files that you want to write to OSS. | No | , |
encoding | The encoding format of the files that you want to write to OSS. | No | utf-8 |
parquetSchema | The schema of the Parquet files that you want to write to OSS. If you set the fileFormat parameter to parquet, you must configure this parameter. Format:
message MessageTypeName {
required dataType columnName;
......................;
}
Fields:
MessageTypeName: the name of the MessageType object.
required: indicates that the column cannot be left empty. You can specify optional instead based on your business requirements. We recommend that you specify optional for all columns.
dataType: the data type of the column. Parquet files support data types such as BOOLEAN, INT32, INT64, INT96, FLOAT, DOUBLE, BINARY, and FIXED_LEN_BYTE_ARRAY. Use BINARY for columns that store strings.
columnName: the name of the column.
Note: Each column definition, including the last one, must end with a semicolon (;).
Example:
message m {
optional int64 id;
optional int64 date_id;
optional binary datetimestring;
optional int32 dspId;
optional int32 advertiserId;
optional int32 status;
optional int64 bidding_req_num;
optional int64 imp;
optional int64 click_num;
}
| No | No default value |
nullFormat | The string that represents a null value. Text files have no standard string that represents a null value, so you can use this parameter to define one. For example, if you set nullFormat to null, Data Integration treats the string null as a null value. | No | No default value |
header | The header row (column names) of the files that you want to write to OSS. Example: ['id', 'name', 'age']. | No | No default value |
maxFileSize (advanced parameter, which is available only in the code editor) | The maximum size of a single file that can be written to OSS. Unit: MB. For example, if you set the maxFileSize parameter to 300, the maximum size of a single file is 300 MB. OSS Writer performs object rotation based on the value of this parameter. Object rotation is similar to log rotation of Log4j. When a file is uploaded to OSS in multiple parts, the maximum size of a part is 10 MB. This size is the minimum granularity used for object rotation: if you set this parameter to a value that is less than 10 MB, the maximum size of a single file that can be written to OSS is still 10 MB. The InitiateMultipartUploadRequest operation can be used to upload a file in a maximum of 10,000 parts at a time, which is consistent with the default value of 100,000 MB (10,000 parts of 10 MB each). If object rotation occurs, suffixes such as _1, _2, and _3 are appended to the new object names, which consist of the prefix and a random UUID. | No | 100,000 |
suffix (advanced parameter, which is available only in the code editor) | The file name extension of the files that you want to write to OSS. For example, if you set the suffix parameter to .csv, the final name of a file written to OSS is in the fileName****.csv format. | No | No default value |
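
The constraints in the table can be checked before a job is submitted. The following Python sketch is illustrative only and is not part of OSS Writer: the helper names check_parameters and effective_max_file_size_mb and the data source name my_oss_source are hypothetical. It encodes only rules stated in the table (required parameters, valid values, default values, the parquetSchema requirement, the compression restriction, and the 10 MB rotation granularity).

```python
import json

# Valid values and defaults copied from the parameter table.
WRITE_MODES = {"truncate", "append", "nonConflict"}
FILE_FORMATS = {"csv", "text", "parquet", "orc"}
REQUIRED = ("datasource", "object", "writeMode")
DEFAULTS = {
    "writeSingleObject": False,
    "fileFormat": "text",
    "fieldDelimiter": ",",
    "encoding": "utf-8",
    "maxFileSize": 100_000,  # MB
}


def check_parameters(params):
    """Hypothetical helper: return a list of problems found in a parameter dict."""
    problems = []
    for name in REQUIRED:
        if not params.get(name):
            problems.append(f"{name} is required")

    # Fill in table defaults for parameters that were not specified.
    merged = {**DEFAULTS, **params}

    mode = params.get("writeMode")
    if mode and mode not in WRITE_MODES:
        problems.append(f"writeMode must be one of {sorted(WRITE_MODES)}")

    file_format = merged["fileFormat"]
    if file_format not in FILE_FORMATS:
        problems.append(f"fileFormat must be one of {sorted(FILE_FORMATS)}")
    if file_format == "parquet" and not merged.get("parquetSchema"):
        problems.append("parquetSchema is required when fileFormat is parquet")
    if file_format in ("csv", "text") and merged.get("compress"):
        problems.append("compress is not supported for csv or text files")
    return problems


def effective_max_file_size_mb(max_file_size):
    """Values below 10 MB still allow files of up to 10 MB (rotation granularity)."""
    return max(max_file_size, 10)


if __name__ == "__main__":
    # "my_oss_source" is a placeholder; use the name of your own data source.
    params = {
        "datasource": "my_oss_source",
        "object": "cdo/datax",
        "writeMode": "truncate",
        "fileFormat": "parquet",
        "parquetSchema": "message m {\n optional int64 id;\n}",
    }
    print(check_parameters(params))
    print(json.dumps(params, indent=2))
```

A check of this kind only catches configuration mistakes that follow from the table. It does not verify that the data source exists or that OSS Writer accepts the job.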