JSON - Realtime Compute for Apache Flink

This topic describes how to use the JSON format and lists the mappings between Flink SQL data types and JSON data types.

Background information

The JSON format reads and writes JSON data based on a JSON structure that is automatically derived from the table schema. The following connectors support the JSON format: the Apache Kafka connector, the Upsert Kafka connector, the Elasticsearch connector, the Object Storage Service (OSS) connector, the ApsaraDB for MongoDB connector, and the StarRocks connector.

Sample code

The following sample code shows how to create a table that uses the JSON format with the Apache Kafka connector.

CREATE TABLE Orders (
  orderId INT,
  product STRING,
  orderInfo MAP<STRING, STRING>,
  orderTime TIMESTAMP(3),
  WATERMARK FOR orderTime AS orderTime - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'test-topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  'json.fail-on-missing-field' = 'false',
  'json.ignore-parse-errors' = 'true'
);
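
To make the schema-to-JSON relationship concrete, the following sketch shows a message that the Orders table above could read, followed by a query on the parsed columns. The message content is a hypothetical example and is not taken from a real topic. The timestamp uses the default SQL format.

```sql
-- Hypothetical Kafka record value on test-topic (one JSON object per record):
-- {"orderId": 1001, "product": "keyboard", "orderInfo": {"color": "black"}, "orderTime": "2020-12-30 12:13:14.123"}

-- Each top-level JSON key is matched to the column with the same name.
SELECT
  orderId,
  product,
  orderInfo['color'] AS color,  -- look up a key in the MAP column
  orderTime
FROM Orders;
```

Because json.ignore-parse-errors is set to true in the table definition, a record that cannot be parsed is skipped instead of causing a failure.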

Parameters

Parameter	Required	Default value	Data type	Description
format	Yes	(none)	STRING	The format that you declare to use. If you want to use the JSON format, set this parameter to json.
json.fail-on-missing-field	No	false	BOOLEAN	Specifies whether to return an error if a field that needs to be parsed is missing. Valid values: true: If a field that needs to be parsed does not exist, an error is returned. false: If a field that needs to be parsed does not exist, the field is set to null. This is the default value.
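json.ignore-parse-errors	No	false	BOOLEAN	Specifies whether to skip data that fails to be parsed. Valid values: true: If the parsing fails, the current field or row is skipped. false: If the parsing fails, an error is returned and the deployment fails. This is the default value. This parameter and the json.fail-on-missing-field parameter cannot both be set to true.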
json.timestamp-format.standard	No	SQL	STRING	The formats of the input timestamp and output timestamp. Valid values: SQL: The input timestamp in the yyyy-MM-dd HH:mm:ss.s{precision} format is parsed. For example, the input timestamp is 2020-12-30 12:13:14.123. The output timestamp is in the same format as the input timestamp. ISO-8601: The input timestamp in the yyyy-MM-ddTHH:mm:ss.s{precision} format is parsed. For example, the input timestamp is 2020-12-30T12:13:14.123. The output timestamp is in the same format as the input timestamp.
json.map-null-key.mode	No	FAIL	STRING	The method that is used to handle null keys in data of the MAP type. Valid values: FAIL: An error is returned if a key in the map is null. DROP: Map entries whose key is null are discarded. LITERAL: A string constant is used to replace a null key in the map. The value of the string constant is specified by the json.map-null-key.literal parameter.
json.map-null-key.literal	No	null	STRING	The string constant that replaces a null key in the map. This parameter takes effect only if the json.map-null-key.mode parameter is set to LITERAL.
json.encode.decimal-as-plain-number	No	false	BOOLEAN	Specifies whether to write data of the DECIMAL type as plain numbers instead of in scientific notation. Valid values: true: All data of the DECIMAL type is written as plain numbers. For example, 0.000000027 is represented as 0.000000027. false: Data of the DECIMAL type may be written in scientific notation. For example, 0.000000027 is represented as 2.7E-8. This is the default value.
json.write-null-properties	No	true	BOOLEAN	Specifies whether to write a column whose value is null to the JSON string. Valid values: true: A column whose value is null is written to the JSON string as a null value. This is the default value. false: A column whose value is null is not written to the JSON string. Note: You can configure this parameter only when Realtime Compute for Apache Flink uses Ververica Runtime (VVR) 8.0.6 or later.
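
The parameters above can be combined in a single WITH clause. The following is a minimal sketch of a sink table that overrides several defaults. The table name OrdersSink, the topic test-sink-topic, and the literal 'unknown' are hypothetical values chosen for illustration.

```sql
CREATE TABLE OrdersSink (
  orderId INT,
  amount DECIMAL(20, 10),
  orderInfo MAP<STRING, STRING>,
  orderTime TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'test-sink-topic',                       -- hypothetical topic name
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  'json.timestamp-format.standard' = 'ISO-8601',     -- write yyyy-MM-ddTHH:mm:ss.s{precision}
  'json.map-null-key.mode' = 'LITERAL',              -- replace null map keys instead of failing
  'json.map-null-key.literal' = 'unknown',           -- hypothetical replacement key
  'json.encode.decimal-as-plain-number' = 'true',    -- avoid scientific notation for DECIMAL
  'json.write-null-properties' = 'false'             -- omit null columns; requires VVR 8.0.6 or later
);
```

With these settings, orderTime is written in the ISO-8601 format, a null key in orderInfo is written as the key unknown, and a column whose value is null is left out of the JSON string.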

Data type mappings

In Realtime Compute for Apache Flink, the JSON format uses the Jackson Databind API to parse and generate JSON data, and the JSON structure is derived from the table schema. The following table describes the mappings between Flink SQL data types and JSON data types.

Flink SQL data type	JSON data type
CHAR / VARCHAR / STRING	STRING
BOOLEAN	BOOLEAN
BINARY / VARBINARY	STRING with encoding: base64
DECIMAL	NUMBER
TINYINT	NUMBER
SMALLINT	NUMBER
INT	NUMBER
BIGINT	NUMBER
FLOAT	NUMBER
DOUBLE	NUMBER
DATE	STRING with format: date
TIME	STRING with format: time
TIMESTAMP	STRING with format: date-time
TIMESTAMP_WITH_LOCAL_TIME_ZONE	STRING with format: date-time (with UTC time zone)
INTERVAL	NUMBER
ARRAY	ARRAY
MAP / MULTISET	OBJECT
ROW	OBJECT
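
As an illustration of the mappings above, the following sketch declares a table with several of the listed types and shows, in a comment, a JSON message that would match the schema. The table name Events, the topic test-events, the column names, and the sample values are all hypothetical.

```sql
CREATE TABLE Events (
  id BIGINT,                              -- JSON NUMBER
  active BOOLEAN,                         -- JSON BOOLEAN
  payload VARBINARY(16),                  -- JSON STRING, base64-encoded
  price DECIMAL(10, 2),                   -- JSON NUMBER
  eventDate DATE,                         -- JSON STRING in date format
  tags ARRAY<STRING>,                     -- JSON ARRAY
  counters MAP<STRING, INT>,              -- JSON OBJECT
  address ROW<city STRING, zip STRING>    -- JSON OBJECT
) WITH (
  'connector' = 'kafka',
  'topic' = 'test-events',                -- hypothetical topic name
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);

-- Hypothetical matching message ("aGVsbG8=" is the base64 encoding of the bytes of "hello"):
-- {"id": 42, "active": true, "payload": "aGVsbG8=", "price": 19.99,
--  "eventDate": "2020-12-30", "tags": ["a", "b"],
--  "counters": {"clicks": 3}, "address": {"city": "Hangzhou", "zip": "310000"}}

SELECT
  id,
  tags[1] AS firstTag,                    -- array indexes start at 1 in Flink SQL
  counters['clicks'] AS clicks,           -- MAP lookup by key
  address.city AS city                    -- nested ROW field
FROM Events;
```

MAP and ROW columns both map to JSON objects. The difference is on the SQL side: a ROW has a fixed set of named fields, whereas a MAP accepts arbitrary keys.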

Others

Data cannot be written to OSS as JSON files. For more information, see FLINK-30635.