DataWorks:ListDataQualityRules

Last Updated: Dec 05, 2024

Queries a list of data quality monitoring rules by page.

Operation description

This API operation is available for all DataWorks editions.

Debugging

You can call this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

No authorization information is currently disclosed for this API operation.

Request parameters

Parameter | Type | Required | Description | Example
ProjectId | long | No | The DataWorks workspace ID. | 10002
DataQualityEvaluationTaskId | long | No | The ID of the data quality monitoring task that is associated with the rule. | 10000
TableGuid | string | No | The Data Map ID of the table that the rule checks. | odps.unit_test.tb_unit_test
Name | string | No | The name of the rule. Fuzzy match is supported. | unit_test
PageSize | integer | No | The number of entries per page. Default value: 10. Maximum value: 200. | 10
PageNumber | integer | No | The page number. Default value: 1. | 1
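
The PageSize and PageNumber parameters can be combined into a simple paging loop. The following is a minimal sketch, not SDK code: `fetch_page` is a hypothetical callable that stands in for whatever client actually sends the ListDataQualityRules request, and it is assumed to return the parsed response as a dict shaped like the sample response in this document. The lower bound of 1 for PageSize is also an assumption; only the maximum of 200 is documented.

```python
import math
from typing import Any, Callable, Dict, Iterator

MAX_PAGE_SIZE = 200  # documented maximum for PageSize


def iter_rules(
    fetch_page: Callable[..., Dict[str, Any]],  # hypothetical: sends one request
    page_size: int = 10,
    **filters: Any,  # optional filters, e.g. ProjectId=10002, Name="unit_test"
) -> Iterator[Dict[str, Any]]:
    """Yield every rule, walking PageNumber from 1 until TotalCount is covered."""
    # Assumption: PageSize must be at least 1 (only the maximum is documented).
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"PageSize must be between 1 and {MAX_PAGE_SIZE}")
    page_number = 1
    while True:
        response = fetch_page(PageNumber=page_number, PageSize=page_size, **filters)
        paging = response["PagingInfo"]
        # In the sample response, the rules are nested inside PagingInfo.
        yield from paging.get("DataQualityRules", [])
        total_pages = math.ceil(paging["TotalCount"] / page_size)
        if page_number >= total_pages:
            break
        page_number += 1
```

Because the function is a generator, a caller can stop early (for example, after finding one rule by name) without requesting the remaining pages.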

Response parameters

Parameter | Type | Description | Example
- | object | The response parameters. | -
RequestId | string | The request ID. | 691CA452-D37A-4ED0-9441
PagingInfo | object | The pagination information. | -
  PageNumber | integer | The page number. | 1
  PageSize | integer | The number of entries per page. | 10
  TotalCount | integer | The total number of entries returned. | 294
  DataQualityRules | array<object> | The rules. | -
    DataQualityRule | object | - | -
      Id | long | The rule ID. | 22130
      Name | string | The rule name. | -
      TenantId | long | The ID of the DataWorks tenant. | 100001
      ProjectId | long | The DataWorks workspace ID. | 100001
      Enabled | boolean | Indicates whether the rule is enabled. | true
      Severity | string | The strength of the rule. Valid values: Normal, High. | High
      Description | string | The description of the rule. The description can be up to 500 characters in length. | this is a odps _sql task
      Target | object | The monitored object of the rule. | -
        Type | string | The type of the monitored object. Valid value: Table. | Table
        DatabaseType | string | The type of the database to which the table belongs. Valid values: maxcompute, emr, cdh, hologres, analyticdb_for_postgresql, analyticdb_for_mysql, starrocks. | maxcompute
        TableGuid | string | The Data Map ID of the table that the rule checks. | odps.unit_test.tb_unit_test
        PartitionSpec | string | The configuration of the partitioned table. | ds=$[yyyymmdd-1]
      TemplateCode | string | The ID of the template used by the rule. | system::user_defined
      SamplingConfig | object | The settings for sampling. | -
        Metric | string | The metric used for sampling. The valid values are listed after this row. | Max

          • Count: the number of rows in the table.
          • Min: the minimum value of the field.
          • Max: the maximum value of the field.
          • Avg: the average value of the field.
          • DistinctCount: the number of unique values of the field after deduplication.
          • DistinctPercent: the ratio of the number of unique values of the field after deduplication to the number of rows in the table, as a percentage.
          • DuplicatedCount: the number of duplicated values in the field.
          • DuplicatedPercent: the ratio of the number of duplicated values of the field to the number of rows in the table, as a percentage.
          • TableSize: the table size.
          • NullValueCount: the number of rows in which the field is set to null.
          • NullValuePercent: the ratio of the number of rows in which the field is set to null to the number of rows in the table, as a percentage.
          • GroupCount: the field value and the number of rows for each field value.
          • CountNotIn: the number of rows in which the field values are different from the referenced values that you specified in the rule.
          • CountDistinctNotIn: the number of unique values that are different from the referenced values that you specified in the rule after deduplication.
          • UserDefinedSql: indicates that the data is sampled by executing custom SQL statements.

        MetricParameters | string | The parameters required for sampling. | { "Columns": [ "id", "name" ] , "SQL": "select count(1) from table;"}
        SettingConfig | string | The statements that configure the parameters required for sampling. They run before the sampling statements and can be up to 1,000 characters in length. Only the MaxCompute database is supported. | SET odps.sql.udf.timeout=600s; SET odps.sql.python.version=cp27;
        SamplingFilter | string | The statements that are used to filter unnecessary data during sampling. The statements can be up to 16,777,215 characters in length. | id IS NULL
      CheckingConfig | object | The check settings for sample data. | -
        Type | string | The threshold calculation method. Valid values: Fixed, Fluctation, FluctationDiscreate, Auto, Average, Variance. | Fixed
        ReferencedSamplesFilter | string | The method that is used to query the referenced samples. To obtain some types of thresholds, you need to query reference values. In this example, an expression is used to indicate the query method of referenced samples. | { "bizdate": [ "-1", "-7", "-1m" ] }
        Thresholds | object | The threshold settings. | -
          Expected | object | The expected threshold setting. | -
            Operator | string | The comparison operator. Valid values: >, >=, <, <=, !=, =. | >
            Value | string | The threshold value. | 100.0
          Warned | object | The threshold settings for normal alerts. | -
            Operator | string | The comparison operator. Valid values: >, >=, <, <=, !=, =. | >
            Value | string | The threshold value. | 100.0
          Critical | object | The threshold settings for critical alerts. | -
            Operator | string | The comparison operator. Valid values: >, >=, <, <=, !=, =. | >
            Value | string | The threshold value. | 100.0
      ErrorHandlers | array<object> | The operations that you can perform after the rule-based check fails. | -
        ErrorHandler | object | The operation that you can perform after the rule-based check fails. | -
          Type | string | The type of the operation. Valid value: SaveErrorData. | SaveErrorData
          ErrorDataFilter | string | The SQL statement that is used to filter failed tasks. If the rule is defined by custom SQL statements, you must specify an SQL statement to filter failed tasks. | SELECT * FROM tb_api_log WHERE id IS NULL
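
The Operator and Value fields of the Expected, Warned, and Critical thresholds are both returned as strings, so a client that wants to reason about them has to convert them first. The sketch below shows one possible way to do that. It rests on two assumptions that this page does not confirm: that Value can be parsed as a number, and that a threshold is met when `sample <Operator> Value` holds. The function name `matches_threshold` is hypothetical.

```python
import operator
from typing import Callable, Dict

# Maps each documented Operator string to a Python comparison function.
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "!=": operator.ne,
    "=": operator.eq,
}


def matches_threshold(sample: float, threshold: Dict[str, str]) -> bool:
    """Return True if `sample <Operator> Value` holds (assumed semantics)."""
    compare = OPERATORS[threshold["Operator"]]  # KeyError on an unknown operator
    # Assumption: Value is a numeric string such as "100.0".
    return compare(sample, float(threshold["Value"]))


# Example with the Expected threshold from the sample response.
expected = {"Operator": ">", "Value": "100.0"}
is_met = matches_threshold(150.0, expected)
```

A lookup table keeps the six documented operators in one place and fails loudly if the service ever returns an operator that is not in the list.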

Examples

Sample success responses

JSON format

{
  "RequestId": "691CA452-D37A-4ED0-9441",
  "PagingInfo": {
    "PageNumber": 1,
    "PageSize": 10,
    "TotalCount": 294,
    "DataQualityRules": [
      {
        "Id": 22130,
        "Name": "",
        "TenantId": 100001,
        "ProjectId": 100001,
        "Enabled": true,
        "Severity": "High",
        "Description": "this is a odps _sql task",
        "Target": {
          "Type": "Table",
          "DatabaseType": "maxcompute",
          "TableGuid": "odps.unit_test.tb_unit_test",
          "PartitionSpec": "ds=$[yyyymmdd-1]"
        },
        "TemplateCode": "system::user_defined",
        "SamplingConfig": {
          "Metric": "Max",
          "MetricParameters": "{ \"Columns\": [ \"id\", \"name\" ] , \"SQL\": \"select count(1) from table;\"}",
          "SettingConfig": "SET odps.sql.udf.timeout=600s; \nSET odps.sql.python.version=cp27;",
          "SamplingFilter": "id IS NULL"
        },
        "CheckingConfig": {
          "Type": "Fixed",
          "ReferencedSamplesFilter": "{ \"bizdate\": [ \"-1\", \"-7\", \"-1m\" ] }",
          "Thresholds": {
            "Expected": {
              "Operator": ">",
              "Value": "100.0"
            },
            "Warned": {
              "Operator": ">",
              "Value": "100.0"
            },
            "Critical": {
              "Operator": ">",
              "Value": "100.0"
            }
          }
        },
        "ErrorHandlers": [
          {
            "Type": "SaveErrorData",
            "ErrorDataFilter": "SELECT * FROM tb_api_log WHERE id IS NULL"
          }
        ]
      }
    ]
  }
}
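
In the sample response, MetricParameters and ReferencedSamplesFilter are strings that themselves contain JSON, so they need a second decoding step after the response body is parsed. The following sketch illustrates this with a trimmed copy of the sample. It assumes that these two fields always hold valid JSON, which the sample suggests but this page does not state.

```python
import json

# Trimmed copy of the sample response, keeping only the fields used here.
raw_body = json.dumps({
    "PagingInfo": {
        "DataQualityRules": [{
            "Id": 22130,
            "SamplingConfig": {
                "Metric": "Max",
                "MetricParameters": '{ "Columns": [ "id", "name" ] , '
                                    '"SQL": "select count(1) from table;"}',
            },
            "CheckingConfig": {
                "ReferencedSamplesFilter": '{ "bizdate": [ "-1", "-7", "-1m" ] }',
            },
        }]
    }
})

# First pass: decode the response body itself.
response = json.loads(raw_body)
rule = response["PagingInfo"]["DataQualityRules"][0]

# Second pass: decode the string fields that embed JSON (assumed valid JSON).
metric_parameters = json.loads(rule["SamplingConfig"]["MetricParameters"])
referenced_samples = json.loads(rule["CheckingConfig"]["ReferencedSamplesFilter"])

print(metric_parameters["Columns"])
print(referenced_samples["bizdate"])
```

If a field might be empty or malformed in practice, wrapping the second `json.loads` call in a `try`/`except json.JSONDecodeError` block would be a reasonable precaution.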

Error codes

For a list of error codes, see Service error codes.

Change history

Change time | Summary of changes
2024-12-04 | The response structure of the API has changed
2024-11-06 | The response structure of the API has changed