Parameter | Required | Valid value | Description |
uri | Yes | String | The URI for accessing HDFS. If the specified URI does not exist or the matched files are all empty, the HDFS TVF returns an empty result set. |
fs.defaultFS | Yes | String | The hostname and port number of HDFS. |
hadoop.username | Yes | String | The username that is used to access HDFS. The value cannot be an empty string. |
hadoop.security.authentication | No | | The authentication method for HDFS. Valid values: Simple and Kerberos. |
hadoop.kerberos.principal | No | String | The Kerberos principal if Kerberos authentication is enabled for HDFS. |
hadoop.kerberos.keytab | No | String | The path of the Kerberos keytab file if Kerberos authentication is enabled for HDFS. |
dfs.client.read.shortcircuit | No | | Specifies whether to enable HDFS short-circuit local reads. The value is of the BOOLEAN type. |
dfs.domain.socket.path | No | String | The path that points to a UNIX domain socket for the communication between the DataNode and the local HDFS client. If you specify the string "_PORT" in the path, the string is replaced by the TCP port of the DataNode. |
dfs.nameservices | No | String | The logical names of the nameservices that provide services. This parameter corresponds to the dfs.nameservices field in the hdfs-site.xml file. |
dfs.ha.namenodes.your-nameservices | No. Note: This parameter is required if Hadoop High Availability (HA) deployment is used. | String | The logical names of the NameNodes. |
dfs.namenode.rpc-address.your-nameservices.your-namenode | No. Note: This parameter is required if Hadoop HA deployment is used. | String | The RPC address on which the NameNode listens. |
dfs.client.failover.proxy.provider.your-nameservices | No. Note: This parameter is required if Hadoop HA deployment is used. | String | The implementation class of the failover proxy provider that clients use to connect to the active NameNode. |
read_json_by_line | No | | Specifies whether to read JSON-formatted data line by line. Default value: true. |
num_as_string | No | | Specifies whether to process numbers as strings. Default value: false. |
fuzzy_parse | No | | Specifies whether to speed up the import of JSON-formatted data. Default value: false. |
jsonpaths | No | String | The fields to be extracted from JSON-formatted data. Format: jsonpaths: [\"$.k2\", \"$.k1\"]. |
strip_outer_array | No | | Specifies whether the JSON-formatted data is presented as an array, in which each element is treated as a row of data. Default value: false. Format: strip_outer_array: true. |
json_root | No | String | The root node of JSON-formatted data. ApsaraDB for SelectDB extracts and parses the elements of the root node that is specified by the json_root parameter. By default, this parameter is left empty. Format: json_root: $.RECORDS. |
trim_double_quotes | No | | Specifies whether to trim the outermost double quotation marks (") of each field in the CSV file. Default value: false. |
skip_lines | No | [0, Integer.MaxValue] | The number of rows to skip at the beginning of the CSV file. The value is of the INTEGER type. Default value: 0. This parameter does not take effect if the format parameter is set to csv_with_names or csv_with_names_and_types. |
path_partition_keys | No | String | The names of the partition key columns that are carried in the specified file path. For example, if the file path is /path/to/city=beijing/date="2023-07-09", set this parameter to city,date. In this case, ApsaraDB for SelectDB automatically reads the corresponding column names and column values from the path during data import. |
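
The path_partition_keys behavior described in the last row can be illustrated with a small Python sketch. It is not the ApsaraDB for SelectDB implementation. The function name extract_partition_values is hypothetical, and stripping the surrounding double quotation marks from a value is an assumption made for this illustration. The sketch only shows how key=value segments in a file path can be mapped to column names and values.

```python
def extract_partition_values(path, partition_keys):
    """Map each requested key to the value of its key=value path segment.

    `path` is a file path such as /path/to/city=beijing/date="2023-07-09".
    `partition_keys` is a comma-separated string such as "city,date".
    Keys that do not appear in the path are omitted from the result.
    """
    wanted = [k.strip() for k in partition_keys.split(",") if k.strip()]
    found = {}
    for segment in path.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        if sep and key in wanted:
            # Assumption: surrounding double quotes are not part of the value.
            found[key] = value.strip('"')
    # Return the values in the order in which the keys were requested.
    return {k: found[k] for k in wanted if k in found}


if __name__ == "__main__":
    example_path = '/path/to/city=beijing/date="2023-07-09"'
    print(extract_partition_values(example_path, "city,date"))
    # prints {'city': 'beijing', 'date': '2023-07-09'}
```

Path segments that are not in key=value form, such as path and to in the example, are ignored, which is why only city and date appear in the result.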