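Datadog is a monitoring and analytics platform for cloud applications. It automatically collects and analyzes logs, metrics, and traces, and monitors infrastructure events and cloud service events, giving you observability across servers, applications, and the data it collects. To send Datadog alert messages to Log Service, configure the URL of the Log Service open alert API in the Webhook integration of Datadog.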
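Datadog configuration
- Log on to the Datadog console.
- Configure a Webhook.
- Configure the notification channel.
  - In the top navigation bar, open the page that lists your Monitors.
  - Click the icon that corresponds to the target Monitor.
  - Set Notify your team to the Webhook that you created in the Configure a Webhook step.
  - Click Save.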
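Datadog alert message
If you add every variable that Datadog provides, but that is not otherwise used, to the annotations field, Log Service receives a Datadog alert message similar to the following.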
{
"alert_instance_id": "123456",
"alert_id": "123456",
"alert_name": "STOP on host:abcdefgh",
"alert_time": "1628647425000",
"fire_time": "1628647425000",
"resolve_time": "1627561306000",
"status": "Triggered",
"labels": {
"tags": "ali,host:abcdefgh,monitor"
},
"annotations": {
"title": "[P1] [Triggered on {host:abcdefgh}] STOP",
"event_msg": "%%%\nwarning\nhost stop\n @webhook-webhook-test-all\n\nThe monitor was last triggered at Thu Jul 29 2021 12:21:45 UTC.\n\n- - -\n\n[[Monitor Status](https://app.datadoghq.com/monitors/1234?to_ts=1234&group=host%3Aabcdefgh&from_ts=1627560405000)] \u00b7 [[Edit Monitor](https://app.datadoghq.com/monitors#1234/edit)] \u00b7 [[View abcdefgh](https://app.datadoghq.com/infrastructure?filter=abcdefgh)] \u00b7 [[Show Processes](https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=1234&tags=host%abcdefgh&from_ts=1627560405000&live=false&showSummaryGraphs=true)]\n%%%",
"text_only_msg": "\nwarning\nhost stop\n @webhook-webhook-test-all\n\nMetric Graph: https://app.datadoghq.com/monitors/1234?to_ts=1627561365000&group=host%abcdefgh&from_ts=1627557705000 \u00b7 Monitor Status: https://app.datadoghq.com/monitors/1234?group=host%abcdefgh \u00b7 Edit Monitor: https://app.datadoghq.com/monitors#42655965/edit \u00b7 Event URL: https://app.datadoghq.com/event/event?id=1234 \u00b7 View abcdefgh: https://app.datadoghq.com/infrastructure?filter=abcdefgh \u00b7 Show Processes: https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=None&tags=host%abcdefgh&from_ts=None&live=false&showSummaryGraphs=true",
"alert_metric": "null",
"alert_query": "\"datadog.agent.up\".over(\"host:abcdefgh\").by(\"host\").last(2).count_by_status()",
"alert_scope": "host:abcdefgh",
"alert_status": "",
"alert_type": "error",
"email": "",
"event_type": "service_check",
"hostname": "abcdefgh",
"logs_sample": "null",
"metric_namespace": "",
"priority": "normal",
"user": "null",
"username": "",
"__aggreg_key__": "a1b2c3",
"__alert_cycle_key__": "123456789",
"__incident_attachments__": "null",
"__incident_commander__": "null",
"__incident_customer_impact__": "null",
"__incident_fildes__": "null",
"__incident_public_id__": "null",
"__incident_title": "null",
"__incident_url__": "null",
"__org_id__": "123",
"__org_name__": "ali",
"__security_rule_name__": "null",
"__security_signal_id__": "null",
"__security_signal_severity__": "null",
"__security_signal_title__": "null",
"__security_signal_msg__": "null",
"__security_signal_attributes__": "null",
"__security_rule_id__": "null",
"__security_rule_query__": "$SECURITY_RULE_QUERY",
"__security_rule_group_by_fields__": "null",
"__security_rule_type__": "null",
"__link_snapshot_url__": "null",
"__synthetics_test_name__": "null",
"__synthetics_first_failing_step_name__": "null"
},
"severity": "P1",
"drill_down_query": "https://app.datadoghq.com/event/event?id=123456"
}
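To check a Webhook setup by hand, it can help to build a request that carries a trimmed-down copy of the message above. The following is a minimal sketch, assuming the endpoint accepts a JSON body sent with HTTP POST; `WEBHOOK_URL` is a hypothetical placeholder and `build_test_request` is an illustrative helper, not part of Datadog or Log Service. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the URL configured in the Datadog Webhook.
WEBHOOK_URL = "https://example.com/your-open-alert-endpoint"


def build_test_request(url, message):
    """Build (but do not send) a JSON POST request carrying a sample alert message."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A shortened version of the sample message shown above.
sample_message = {
    "alert_instance_id": "123456",
    "alert_id": "123456",
    "alert_name": "STOP on host:abcdefgh",
    "alert_time": "1628647425000",
    "fire_time": "1628647425000",
    "status": "Triggered",
    "labels": {"tags": "ali,host:abcdefgh,monitor"},
    "annotations": {"title": "[P1] [Triggered on {host:abcdefgh}] STOP"},
    "severity": "P1",
}

request = build_test_request(WEBHOOK_URL, sample_message)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Keeping construction separate from sending makes it possible to inspect the body and headers before anything reaches a real endpoint.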
Field mapping
After a Datadog alert message is ingested into Log Service, it is mapped to Log Service alert content. Example:
{
"aliuid": "aliuid1",
"alert_instance_id": "123456",
"alert_id": "123456",
"alert_type": "sls_pub",
"alert_name": "STOP on host:abcdefgh",
"region": "",
"project": "",
"project_id": 0,
"next_eval_interval": 0,
"alert_time": 1628647425,
"fire_time": 1628647425,
"fire_results": null,
"fire_results_count": 0,
"resolve_time": 0,
"status": "firing",
"results": null,
"labels":{
"__ali__": "ali",
"__host__": "abcdefgh",
"__monitor__": "monitor"
},
"annotations":{
"__aggreg_key__": "1a2b3c4d",
"__alert_cycle_key__": "123456",
"__config_app__": "sls_pub_alert",
"__link_edit_monitor__": "https://app.datadoghq.com/monitors#1234/edit",
"__link_metric_graph__": "https://app.datadoghq.com/monitors/1234?to_ts=1628647485000&group=host%abcdefgh&from_ts=1628643825000",
"__link_monitor_status__": "https://app.datadoghq.com/monitors/123?group=host%abcdefgh",
"__link_show_processes__": "https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=None&tags=host%abcdefgh&from_ts=None&live=false&showSummaryGraphs=true",
"__link_view_izbp****hqpwt26z__": "https://app.datadoghq.com/infrastructure?filter=abcdefgh",
"__org_id__": "579186",
"__org_name__": "ali",
"__pub_alert_app__": "",
"__pub_alert_protocol__": "datadog",
"__pub_alert_region__": "",
"__pub_alert_service__": "",
"alert_query": "\"datadog.agent.up\".over(\"host:abcdefgh\").by(\"host\").last(2).count_by_status()",
"alert_scope": "host:izbp1cerzh0yyvrhqpwt26z",
"alert_type": "error",
"desc": "warning\nhost stop\n@webhook-test\nThe monitor was last triggered at Wed Aug 11 2021 02:03:45 UTC.\n- - -\n",
"event_type": "service_check",
"hostname": "abcdefgh",
"priority": "normal",
"title": "[P1] [Triggered on {host:abcdefgh}] STOP"
},
"severity": 10,
"policy":{
"alert_policy_id": "",
"action_policy_id": "",
"use_default": false,
"repeat_interval": "0s"
},
"template": null,
"drill_down_query": "https://app.datadoghq.com/event/event?id=123456"
}
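Comparing the two samples suggests two transformations: the millisecond timestamp strings in the Datadog message appear as second-precision integers after mapping, and the labelled links in text_only_msg appear as `__link_*__` annotations. The sketch below illustrates both under those observations. The function names and the regular expression are assumptions for illustration, not the parser that Log Service actually uses; for instance, the mapped sample shows no key for the Event URL link, which this sketch would emit.

```python
import re


def ms_string_to_seconds(value):
    """Convert a millisecond timestamp string such as "1628647425000" to whole seconds."""
    return int(value) // 1000


# A label (letters, digits, underscores, spaces) followed by ": " and a URL.
LINK_PATTERN = re.compile(r"([A-Za-z][\w ]*?): (https?://\S+)")


def extract_links(text_only_msg):
    """Collect "Label: URL" pairs into keys of the form __link_<label>__."""
    links = {}
    for name, url in LINK_PATTERN.findall(text_only_msg):
        key = "__link_" + name.strip().lower().replace(" ", "_") + "__"
        links[key] = url
    return links


sample_text = (
    "\nwarning\nhost stop\n @webhook-webhook-test-all\n\n"
    "Metric Graph: https://app.datadoghq.com/monitors/1234?group=host \u00b7 "
    "Edit Monitor: https://app.datadoghq.com/monitors#1234/edit \u00b7 "
    "View abcdefgh: https://app.datadoghq.com/infrastructure?filter=abcdefgh"
)

print(ms_string_to_seconds("1628647425000"))
for key, url in extract_links(sample_text).items():
    print(key, url)
```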
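Log Service | Datadog | Description |
---|---|---|
aliuid | N/A | The ID of the Alibaba Cloud account that owns the open alert application used to ingest the alert. |
alert_id | alert_id | The ID of the alert monitoring rule. |
alert_instance_id | alert_instance_id | The ID of the alert message. |
alert_type | N/A | The alert type. The value is fixed at sls_pub. |
alert_name | alert_name | The name of the alert monitoring rule. |
status | status | The alert status. |
next_eval_interval | N/A | The alert evaluation interval. The value is fixed at 0. |
alert_time | alert_time | The time when the alert was triggered. |
fire_time | fire_time | The time when the alert was first triggered. |
resolve_time | resolve_time | The time when the alert was resolved. |
labels | labels | The alert labels. The value of the tags field in the labels field of the Datadog alert message is split on commas (,) into multiple strings. For example, "ali,host:1a2b3c4d" is split into "ali" and "host:1a2b3c4d". In the mapped example above, the tags "ali,host:abcdefgh,monitor" become the labels __ali__, __host__, and __monitor__. In addition, every other unused field in the labels field of the Datadog alert message that has a non-empty value is added, with its value, to the labels field of the Log Service alert message. |
annotations | annotations | After a Datadog alert is ingested, Log Service adds extra fields to the annotations field of the Log Service alert, including fields parsed from the text_only_msg field of the Datadog alert message. In addition, every other unused field in the annotations field of the Datadog alert message that has a non-empty value is added, with its value, to the annotations field of the Log Service alert message. |
severity | severity | The alert severity. Datadog severity levels are mapped to Log Service severity levels. Note: If no severity is defined in the Datadog alert, the Log Service alert severity is mapped to Medium. |
policy | N/A | The alert policy configured in the open alert application. For more information, see Policy structure. |
project | N/A | The Project to which the alert center belongs. For more information, see Project. |
drill_down_query | drill_down_query | Click the link in the field value to go to the management page of the Datadog alert event. |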
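The labels and annotations rows can be sketched in a few lines of Python. This is an illustration based only on the sample messages in this topic, not the Log Service implementation. The helper names are hypothetical, and treating the string "null" and unresolved variables such as "$SECURITY_RULE_QUERY" as empty is an assumption inferred from which fields are absent in the mapped sample.

```python
def parse_tags(tags):
    """Split a comma-separated tags string into label key-value pairs.

    "host:abcdefgh" becomes {"__host__": "abcdefgh"}; a bare tag such as
    "ali" becomes {"__ali__": "ali"}, as in the mapped sample.
    """
    labels = {}
    for tag in tags.split(","):
        tag = tag.strip()
        if not tag:
            continue
        key, sep, value = tag.partition(":")
        labels[f"__{key}__"] = value if sep else key
    return labels


def is_effectively_empty(value):
    """Assumption: "", "null", and unresolved "$VARIABLE" values count as empty."""
    if value is None:
        return True
    text = str(value).strip()
    return text in ("", "null") or text.startswith("$")


def drop_empty(fields):
    """Keep only the fields whose values are not effectively empty."""
    return {k: v for k, v in fields.items() if not is_effectively_empty(v)}


print(parse_tags("ali,host:abcdefgh,monitor"))
print(drop_empty({
    "alert_metric": "null",
    "email": "",
    "hostname": "abcdefgh",
    "__security_rule_query__": "$SECURITY_RULE_QUERY",
}))
```

Using `str.partition` keeps any later colons inside the label value, so a tag such as "url:http://host" is not split twice.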