Parameter | Description | Required | Default value |
datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value |
selectedDatabase | The schema of the database from which you want to read data. | Yes | No default value |
table | The name of the table from which you want to read data. The name of the table must be in the schema.tableName format. Note: For example, if the selectedDatabase parameter is set to AUTOTEST and the table name is table01, you must set the table parameter to AUTOTEST.table01. | Yes | No default value |
column | The names of the columns from which you want to read data. Specify the names in a JSON array. The default value is ["*"], which indicates all the columns in the source table. You can select specific columns to read. You can also specify the columns in an order different from the order defined in the schema of the source table. Constants are supported. The following example shows a valid JSON array:
["id", "1", "'mingya.wmy'", "null", "to_char(a + 1)", "2.3" , "true"]
id: a column name. 1: an integer constant. 'mingya.wmy': a string constant, which is enclosed in single quotation marks ('). null: a null value. to_char(a + 1): a function expression. 2.3: a floating-point constant. true: a Boolean value.
The column parameter cannot be left empty.
| Yes | No default value |
splitFactor | The sharding factor, which determines the number of parts into which the data to be synchronized is sharded. If you configure parallelism for your batch synchronization task, the number of parts is calculated based on the following formula: Number of parallel threads × Sharding factor. For example, if the number of parallel threads and the sharding factor are both 5, the data to be synchronized is sharded into 25 parts. Note: We recommend that you specify a sharding factor that ranges from 1 to 100. If you specify a sharding factor that is greater than 100, an out-of-memory (OOM) error may occur. | No | 5 |
splitMode | The shard mode. Valid values: averageInterval (average sampling): the maximum and minimum values of all data are identified based on the splitPk parameter, and data is then evenly distributed based on the number of shards. randomSampling (random sampling): data entries are randomly identified as sharding points.
Note: The splitMode parameter must be used together with the splitPk parameter. If the splitPk parameter is set to a numeric field, set the splitMode parameter to averageInterval. If the splitPk parameter is set to a string field, set the splitMode parameter to randomSampling.
| No | randomSampling |
splitPk | The field that is used for data sharding when Oracle Reader reads data. If you configure this parameter, the source table is sharded based on the value of this parameter, and Data Integration runs parallel threads to read data. This improves data synchronization efficiency. We recommend that you set the splitPk parameter to the name of a primary key column of the table. Data can then be evenly distributed across shards based on the primary key column, instead of being concentrated in a few shards. The splitPk parameter supports data sharding for data types such as numeric and string. The splitMode parameter must be used together with the splitPk parameter. If the splitPk parameter is set to a numeric field, set the splitMode parameter to averageInterval. If the splitPk parameter is set to a string field, set the splitMode parameter to randomSampling.
If you do not configure the splitPk parameter, Oracle Reader uses a single thread to read all data in the source table.
Note: If you use Oracle Reader to read data from a view, you cannot set the splitPk parameter to a field of the ROWID data type. | No | No default value |
where | The WHERE clause. Oracle Reader generates an SQL statement based on the settings of the table, column, and where parameters and uses the statement to read data. For example, during testing, you can set this parameter to row_number(). You can also use the WHERE clause to read incremental data. If the where parameter is not configured or is left empty, Data Integration reads all data.
| No | No default value |
querySql (available only in the code editor) | The SQL statement that is used for refined data filtering. If you configure this parameter, Data Integration filters data based on the value of this parameter. For example, if you want to join multiple tables for data synchronization, you can set this parameter to select a,b from table_a join table_b on table_a.id = table_b.id. If you configure this parameter, Oracle Reader ignores the settings of the table, column, and where parameters. | No | No default value |
fetchSize | The number of data records to read at a time. This parameter determines the number of interactions between Data Integration and the database and affects read efficiency. Note: If you set this parameter to a value greater than 2048, an OOM error may occur during data synchronization. | No | 1024 |
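
The rules in the table can be sketched as a small Python helper. This is only an illustration, not part of Data Integration: the function names, the data source name my_oracle_source, and the use of id as the sharding column are hypothetical, and placing all keys in one flat object is an assumption about how the code editor expects them. The sketch builds the schema.tableName value, rejects an empty column list, pairs splitMode with the type of the splitPk field, and applies the formula Number of parallel threads × Sharding factor.

```python
import json


def shard_count(parallel_threads, split_factor=5):
    """Number of parts = number of parallel threads x sharding factor."""
    if parallel_threads < 1 or split_factor < 1:
        raise ValueError("both values must be positive integers")
    return parallel_threads * split_factor


def split_mode_for(split_pk_type):
    """Pick the splitMode that the table recommends for the splitPk field type."""
    modes = {"numeric": "averageInterval", "string": "randomSampling"}
    if split_pk_type not in modes:
        raise ValueError("split_pk_type must be 'numeric' or 'string'")
    return modes[split_pk_type]


def build_reader_parameter(datasource, schema, table_name, columns,
                           split_pk=None, split_pk_type=None,
                           split_factor=5, fetch_size=1024):
    """Assemble a parameter object from the settings described in the table.

    Hypothetical helper: the flat layout of the keys is an assumption.
    """
    if not columns:
        raise ValueError("the column parameter cannot be left empty")
    parameter = {
        "datasource": datasource,
        "selectedDatabase": schema,
        # The table name must be in the schema.tableName format.
        "table": f"{schema}.{table_name}",
        "column": list(columns),
        "fetchSize": fetch_size,
    }
    if split_pk is not None:
        # splitMode must be used together with splitPk.
        parameter["splitPk"] = split_pk
        parameter["splitMode"] = split_mode_for(split_pk_type)
        parameter["splitFactor"] = split_factor
    return parameter


if __name__ == "__main__":
    params = build_reader_parameter(
        "my_oracle_source",  # hypothetical data source name
        "AUTOTEST",
        "table01",
        ["id", "'mingya.wmy'", "to_char(a + 1)"],
        split_pk="id",  # assumed numeric primary key column
        split_pk_type="numeric",
    )
    print(json.dumps(params, indent=2))
    print(shard_count(parallel_threads=5, split_factor=5))
```

With five parallel threads and the default sharding factor of 5, shard_count returns 25, which matches the example in the splitFactor row. When split_pk is omitted, the helper leaves out splitPk, splitMode, and splitFactor, which corresponds to the single-thread read described in the splitPk row.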