Alibaba Cloud Service Mesh: Configure SLOs for applications in ASM

Last Updated: Mar 11, 2026

Service level objectives (SLOs) let you set measurable availability targets for your applications and receive alerts when the error budget burns too fast. By defining SLOs in Service Mesh (ASM), you track the ratio of failed requests to total requests over a rolling time window and get notified at different severity levels, ideally before users notice degradation.

Key concepts

| Term | Definition |
| --- | --- |
| Service level indicator (SLI) | A metric that measures service reliability. In ASM, the default SLI tracks the ratio of failed requests (5xx and 429 responses) to total requests. |
| Service level objective (SLO) | A target percentage for an SLI over a rolling time window, such as 99% availability over 30 days. |
| Error budget | The acceptable amount of unreliability derived from an SLO target. A 99% SLO allows a 1% error budget. |
| Burn rate | How fast the error budget is consumed. A burn rate of 1 means the budget will be exhausted exactly at the end of the time window. Higher values trigger alerts. |

For a full explanation of SLO concepts in ASM, see SLO overview.
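
The relationship between the objective, the error budget, and the burn rate can be sketched in a few lines of Python. This is only an illustration of the definitions above, not ASM code: the function names are hypothetical, and the 3% error ratio is an assumed observation.

```python
def error_budget(objective: float) -> float:
    """Fraction of requests allowed to fail, for example 0.99 -> 0.01."""
    if not 0.0 < objective < 1.0:
        raise ValueError("objective must be between 0 and 1 (exclusive)")
    return 1.0 - objective


def burn_rate(error_ratio: float, objective: float) -> float:
    """Observed error ratio divided by the error budget."""
    return error_ratio / error_budget(objective)


if __name__ == "__main__":
    # Assumed observation: 3% of requests failed against a 99% objective.
    print(f"error budget: {error_budget(0.99):.2%}")
    print(f"burn rate:    {burn_rate(0.03, 0.99):.1f}x")
```

With these assumed inputs, the budget is 1% and the burn rate is about 3, meaning the budget would run out in roughly a third of the time window if the error ratio stayed at that level.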

Prerequisites

Before you begin, make sure you have:

Step 1: Deploy the sample application

This tutorial uses the HTTPBin application as an example. If you already have an application deployed in your mesh, skip to Step 3.

  1. Create a file named httpbin.yaml with the following content:

    ##################################################################################################
    # httpbin service
    ##################################################################################################
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: httpbin
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: httpbin
      labels:
        app: httpbin
        service: httpbin
    spec:
      ports:
      - name: http
        port: 8000
        targetPort: 80
      selector:
        app: httpbin
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: httpbin
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: httpbin
          version: v1
      template:
        metadata:
          labels:
            app: httpbin
            version: v1
        spec:
          serviceAccountName: httpbin
          containers:
          - image: docker.io/kennethreitz/httpbin
            imagePullPolicy: IfNotPresent
            name: httpbin
            ports:
            - containerPort: 80
  2. Connect to the ACK cluster with kubectl and run the following command to deploy the application. For instructions on connecting with kubectl, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

    kubectl apply -f httpbin.yaml

Step 2: Create a virtual service and an Istio gateway

  1. Create a file named httpbin-gateway.yaml with the following content:

    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: httpbin-gateway
    spec:
      selector:
        istio: ingressgateway
      servers:
      - port:
          number: 80
          name: http
          protocol: HTTP
        hosts:
        - "*"
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: httpbin
    spec:
      hosts:
      - "*"
      gateways:
      - httpbin-gateway
      http:
      - route:
        - destination:
            host: httpbin
            port:
              number: 8000
  2. Connect to the ASM instance with kubectl and run the following command to deploy the gateway and virtual service. For instructions on connecting to the ASM control plane, see Use kubectl on the control plane to access Istio resources.

    kubectl apply -f httpbin-gateway.yaml
  3. Verify the deployment by opening http://<ingress-gateway-ip> in your browser. If the HTTPBin page loads, the application is running. To find your ingress gateway IP, see Use Istio resources to route traffic to different versions of a service.

Step 3: Create an SLO

This example creates an SLO for the HTTPBin service in the default namespace with the following settings:

| Parameter | Value | Description |
| --- | --- | --- |
| Duration | 30 days | Rolling time window for the SLO |
| Plugin type | availability | SLI based on request success rate |
| Objective | 99% | Target availability percentage |
| Alert levels | Page, Ticket | Page for urgent issues, Ticket for non-urgent issues |

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of your ASM instance. In the left-side navigation pane, choose Observability Management Center > SLO Configuration.

  3. Select the default namespace from the Namespace drop-down list. In the httpbin service row, click Create in the Actions column.

  4. In the Basic Information section, set Duration to 30d.

  5. Click the SLO rule tab and configure the following settings:

    • Set Name to asm-slo.

    • Set Plugin type to availability.

    • Set Objective to 99.

    • Turn on Enable alerting rules and set Alerting rules name to asm-alert.

    • Turn on Enable alerting rule with Ticket level.

    • Turn on Enable alerting rule with Page level.

  6. (Optional) Click Preview at the bottom of the page to review the configuration. Confirm that the settings are correct and click Submit.

  7. Click Create at the bottom of the page.

For a detailed explanation of each field, see Description of SLO CRD fields.

Step 4: View the generated Prometheus rules

After you create the SLO, ASM automatically generates Prometheus recording and alerting rules. These rules define how SLI metrics are calculated across multiple time windows and when alerts fire based on error budget burn rate.

To view the rules, find the httpbin service on the SLO Configuration page and click View Prometheus rules in the Actions column.

Generated rule groups

The generated Prometheus rules fall into three groups:

| Rule group | Purpose |
| --- | --- |
| SLI recordings (asm-slo-sli-recordings-httpbin-asm-slo) | Calculates error ratios across sliding windows (5m, 30m, 1h, 2h, 6h, 1d, 3d, and 30d) using the istio_requests_total metric. Errors include 5xx and 429 response codes. |
| Meta recordings (asm-slo-meta-recordings-httpbin-asm-slo) | Stores SLO metadata: objective (0.99), error budget (0.01), time period (30 days), current burn rate, period burn rate, and remaining error budget. |
| Alerts (asm-slo-alerts-httpbin-asm-slo) | Defines multi-window, multi-burn-rate alerts at two severity levels. |
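
To make the structure of the SLI recording expressions easier to read, the following Python sketch rebuilds the error-ratio expression for each short window. It mirrors the shape of the generated rules shown later on this page, but it is not how ASM generates them: the helper name is hypothetical and the whitespace differs from the generated output.

```python
WINDOWS = ["5m", "30m", "1h", "2h", "6h", "1d", "3d"]


def error_ratio_expr(service: str, namespace: str, window: str) -> str:
    """Build a PromQL error-ratio expression for one rate window."""
    selector = (
        f'destination_service_name="{service}",'
        f'destination_service_namespace="{namespace}"'
    )
    # Numerator: request rate for 5xx and 429 responses only.
    errors = (
        f'sum(rate(istio_requests_total{{{selector},'
        f'response_code=~"(5..|429)"}}[{window}]))'
    )
    # Denominator: request rate for all responses.
    total = f"sum(rate(istio_requests_total{{{selector}}}[{window}]))"
    # "> 0" drops the denominator when there is no traffic, so the division
    # returns nothing and "OR on() vector(0)" supplies an error ratio of 0.
    return f"(({errors} / ({total} > 0)) OR on() vector(0))"


if __name__ == "__main__":
    for window in WINDOWS:
        print(f"slo:sli_error:ratio_rate{window}")
        print("  " + error_ratio_expr("httpbin", "default", window))
```

The 30d recording is different: in the generated rules it averages the 5m ratio over 30 days with sum_over_time and count_over_time instead of evaluating a 30-day rate directly.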

Alert thresholds

ASM uses a multi-window, multi-burn-rate approach to detect error budget consumption at different speeds:

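| Severity | Short window | Long window | Burn rate factor | Budget consumed |
| --- | --- | --- | --- | --- |
| Page | 5m > 14.4x | 1h > 14.4x | 14.4 | 2% in 1 hour |
| Page | 30m > 6x | 6h > 6x | 6 | 5% in 6 hours |
| Ticket | 2h > 3x | 1d > 3x | 3 | 10% in 1 day |
| Ticket | 6h > 1x | 3d > 1x | 1 | 10% in 3 days |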

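A Page-level alert fires when the error budget burns fast enough to be exhausted within a few days (about 50 hours at 14.4x, or 5 days at 6x) and requires immediate attention. A Ticket-level alert fires for slower burns that need action but are less urgent. Each condition requires both the short window and the long window to exceed the threshold, so a brief spike alone does not fire an alert.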
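
The "Budget consumed" column follows from simple arithmetic: the burn rate factor multiplied by the long window, divided by the SLO period. The sketch below reproduces that calculation for the 30-day window used in this example. The function names are hypothetical and chosen only for illustration.

```python
PERIOD_HOURS = 30 * 24  # 30-day SLO window used in this example

# (severity, long window in hours, burn rate factor) from the threshold table
THRESHOLDS = [
    ("page", 1, 14.4),
    ("page", 6, 6.0),
    ("ticket", 24, 3.0),
    ("ticket", 72, 1.0),
]


def budget_consumed(burn_rate: float, window_hours: float,
                    period_hours: float = PERIOD_HOURS) -> float:
    """Fraction of the error budget spent at a constant burn rate."""
    return burn_rate * window_hours / period_hours


def hours_to_exhaustion(burn_rate: float,
                        period_hours: float = PERIOD_HOURS) -> float:
    """Hours until the whole budget is gone at a constant burn rate."""
    return period_hours / burn_rate


if __name__ == "__main__":
    for severity, window_hours, factor in THRESHOLDS:
        pct = budget_consumed(factor, window_hours) * 100
        print(f"{severity:6} {factor:>5}x over {window_hours:>2}h: "
              f"{pct:.0f}% of budget, "
              f"exhausted in {hours_to_exhaustion(factor):.0f}h")
```

A different SLO period changes these percentages, so the thresholds should be rechecked if Duration is not 30 days.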

Full Prometheus rule YAML

groups:
- name: asm-slo-sli-recordings-httpbin-asm-slo
  rules:
  - record: slo:sli_error:ratio_rate5m
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[5m])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[5m])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 5m
  - record: slo:sli_error:ratio_rate30m
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[30m])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[30m])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 30m
  - record: slo:sli_error:ratio_rate1h
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[1h])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[1h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 1h
  - record: slo:sli_error:ratio_rate2h
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[2h])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[2h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 2h
  - record: slo:sli_error:ratio_rate6h
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[6h])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[6h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 6h
  - record: slo:sli_error:ratio_rate1d
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[1d])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[1d])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 1d
  - record: slo:sli_error:ratio_rate3d
    expr: "(\n(\n  sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[3d])) \n  /          \n  (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[3d])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 3d
  - record: slo:sli_error:ratio_rate30d
    expr: |
      sum_over_time(slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}[30d])
      / ignoring (slo_window)
      count_over_time(slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}[30d])
    labels:
      slo_window: 30d
- name: asm-slo-meta-recordings-httpbin-asm-slo
  rules:
  - record: slo:objective:ratio
    expr: vector(0.99)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:error_budget:ratio
    expr: vector(1-0.99)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:time_period:days
    expr: vector(30)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:current_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
      / on(slo_id, asm_slo, slo_service) group_left
      slo:error_budget:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:period_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate30d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
      / on(slo_id, asm_slo, slo_service) group_left
      slo:error_budget:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:period_error_budget_remaining:ratio
    expr: 1 - slo:period_burn_rate:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo",
      slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: asm_slo_info
    expr: vector(1)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_mode: cli-gen-prom
      slo_objective: "99"
      slo_service: httpbin
      slo_spec: prometheus/v1
      slo_version: dev
- name: asm-slo-alerts-httpbin-asm-slo
  rules:
  - alert: asm-alert
    expr: |
      (
          (slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (14.4 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate1h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (14.4 * 0.01))
      )
      or ignoring (slo_window)
      (
          (slo:sli_error:ratio_rate30m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (6 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate6h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (6 * 0.01))
      )
    labels:
      slo_severity: page
    annotations:
      summary: '{{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is over expected.'
      title: (page) {{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is too fast.
  - alert: asm-alert
    expr: |
      (
          (slo:sli_error:ratio_rate2h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (3 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate1d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (3 * 0.01))
      )
      or ignoring (slo_window)
      (
          (slo:sli_error:ratio_rate6h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (1 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate3d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (1 * 0.01))
      )
    labels:
      slo_severity: ticket
    annotations:
      summary: '{{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is over expected.'
      title: (ticket) {{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget
        burn rate is too fast.
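
As a rough way to reason about when the two alert rules above would fire, the following Python sketch applies the same threshold comparisons to a set of per-window error ratios. It is a simplified model under stated assumptions: the function name and the sample ratios are hypothetical, and Prometheus label matching is not modeled.

```python
ERROR_BUDGET = 0.01  # 1 - 0.99, as in the generated rules

# (short window, long window, burn rate factor) pairs for each severity
RULES = {
    "page": [("5m", "1h", 14.4), ("30m", "6h", 6.0)],
    "ticket": [("2h", "1d", 3.0), ("6h", "3d", 1.0)],
}


def firing_severities(ratios: dict) -> list:
    """Return the severities whose alert expression would be true."""
    firing = []
    for severity, pairs in RULES.items():
        for short, long_, factor in pairs:
            threshold = factor * ERROR_BUDGET
            # Both windows must exceed the threshold ("and" in the rule).
            if ratios[short] > threshold and ratios[long_] > threshold:
                firing.append(severity)
                break  # one matching pair is enough ("or" in the rule)
    return firing


if __name__ == "__main__":
    # Assumed values: a short spike that has not yet affected long windows.
    spike = {"5m": 0.20, "30m": 0.05, "1h": 0.03, "2h": 0.02,
             "6h": 0.005, "1d": 0.002, "3d": 0.001}
    # Assumed values: errors sustained for more than an hour.
    sustained = {"5m": 0.20, "30m": 0.18, "1h": 0.15, "2h": 0.12,
                 "6h": 0.07, "1d": 0.02, "3d": 0.008}
    print("spike:    ", firing_severities(spike))
    print("sustained:", firing_severities(sustained))
```

In this model the short spike fires nothing because no long window has crossed its threshold, while the sustained case fires only the page alert.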

What to do next

Import the generated Prometheus rules into your Prometheus system and visualize SLO metrics in Grafana: