Benefits - Application Real-Time Monitoring Service - Alibaba Cloud Documentation Center

Managed Service for Prometheus is a fully managed, Prometheus-compatible monitoring service that collects metrics from a wide range of components, including mainstream open-source infrastructure. It provides the same experience as open-source Prometheus -- PromQL, Grafana dashboards, and native collection rules -- without the operational overhead of running Prometheus yourself.

Out-of-the-box monitoring

Set up monitoring in minutes instead of days:

Instant Kubernetes monitoring: Create a Prometheus instance for any Kubernetes cluster or cloud resource directly from the Application Real-Time Monitoring Service (ARMS) console. A container monitoring setup that typically takes 3 days to build from scratch takes 10 minutes with Managed Service for Prometheus.
Built-in alerting and dashboards: Pre-configured Grafana dashboards and alerting rules cover common application components and cloud services. No manual dashboard creation or Alertmanager installation required.
Managed operations: Health inspection, automatic agent upgrades, and visualized configuration are built in, reducing the operational burden of self-hosted Prometheus.

Cost-effective

Free Kubernetes metrics: Monitor Kubernetes components at no cost with a set of included metrics.
No infrastructure to manage: The service is fully managed. You do not need to purchase or maintain additional servers, and nearly all O&M overhead is eliminated.
Fast time to value: Integrates directly with Container Service for Kubernetes (ACK). Start monitoring ACK clusters without building a separate monitoring stack.

Compatible with open-source Prometheus

If you already run Prometheus, your existing configuration works here:

Native collection rules: Supports prometheus.yml scrape configs, ServiceMonitor custom resources, and annotation-based auto-discovery. ServiceMonitor is especially useful for monitoring custom Kubernetes clusters.
Standard query and data model: Supports PromQL queries, custom multi-dimensional data models, and the HTTP API -- the same interfaces you already use with open-source Prometheus.
Flexible service discovery: Static file configurations and dynamic discovery mechanisms make it straightforward to migrate existing monitoring setups, integrate new data sources, and discover monitoring targets.
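Because the query interface is the open-source Prometheus HTTP API, existing client code should carry over largely unchanged. The following is a minimal sketch, not an official client: it builds a request URL for the standard instant-query endpoint (/api/v1/query) and parses a response in the standard JSON format. The host name prometheus.example.com and the helper names are assumptions made for illustration; use the HTTP API address shown for your own instance and add whatever authentication it requires.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL (assumption): replace with your instance's HTTP API address.
BASE_URL = "https://prometheus.example.com"


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the standard Prometheus instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(body: str) -> dict:
    """Map each series' sorted label pairs to its sample value (as a float)."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    samples = {}
    for series in data["result"]:
        labels = tuple(sorted(series["metric"].items()))
        _timestamp, value = series["value"]  # the value arrives as a string
        samples[labels] = float(value)
    return samples


# Hand-written sample in the standard response format, used instead of a live call.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "node"},
             "value": [1700000000, "1"]},
        ],
    },
})

if __name__ == "__main__":
    print(instant_query_url(BASE_URL, 'up{job="node"}'))
    print(parse_vector(SAMPLE_RESPONSE))
```

To run this against a real instance, fetch the generated URL with any HTTP client and pass the response body to parse_vector.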

Scalable cloud storage

No storage ceiling: Cloud-based distributed storage scales with your data. There is no upper limit on data volume, and distributed replication safeguards data reliability.
Cross-cluster queries: Use Global DataSource and Global View to query and visualize metrics across multiple Kubernetes clusters in a single aggregate view.

High performance

Managed Service for Prometheus outperforms self-hosted Prometheus in both collection throughput and query speed:

6x single-replica throughput: On 2-core CPU / 4 GB memory hardware, each replica collects 6 million time points per scrape, compared to 1 million for open-source Prometheus. The collection component is optimized to improve single-replica throughput while reducing resource consumption.
20x collection efficiency: A lightweight agent architecture requires only a single process per Kubernetes cluster, improving collection performance by 20 times over open-source Prometheus. The Prometheus agent is deployed on the user side, leveraging native collection capabilities to minimize resource usage.
18-22x faster queries: Querying 600 million time points takes 8 to 10 seconds, compared to 180 seconds with open-source Prometheus.
Separation of collection and storage: Decoupling these layers lets each scale independently, removing the single-process bottleneck of open-source Prometheus.
Auto-scaling agents: The number of agent replicas increases or decreases automatically based on workload, distributing collection tasks across replicas.
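The multipliers in the list above can be checked against the raw figures it quotes. The short calculation below only restates those published numbers; the variable names are illustrative, and no new measurements are implied.

```python
# Figures quoted in the list above.
MANAGED_POINTS_PER_SCRAPE = 6_000_000   # per replica, 2-core CPU / 4 GB memory
OSS_POINTS_PER_SCRAPE = 1_000_000       # open-source Prometheus, same hardware
MANAGED_QUERY_SECONDS = (8, 10)         # best and worst case for 600 million points
OSS_QUERY_SECONDS = 180

# Single-replica collection throughput gain.
throughput_gain = MANAGED_POINTS_PER_SCRAPE / OSS_POINTS_PER_SCRAPE

# Query speedup range: slowest managed time gives the low end, fastest the high end.
speedup_low = OSS_QUERY_SECONDS / max(MANAGED_QUERY_SECONDS)
speedup_high = OSS_QUERY_SECONDS / min(MANAGED_QUERY_SECONDS)

print(throughput_gain)  # 6.0
print(speedup_low)      # 18.0
print(speedup_high)     # 22.5
```

The upper bound works out to 22.5, which the heading rounds down to 22.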

High availability

Multi-replica architecture: Data collection, processing, and storage components each run with multiple replicas to maintain high availability across core data paths.
Horizontal scaling: Elastic scale-out adjusts capacity based on cluster size. No manual intervention required.
Automatic data retransmission: A built-in retry mechanism preserves data integrity and accuracy, even during transient failures.
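The source does not describe how retransmission is implemented. As a purely generic illustration of the idea, the sketch below retries a failed send with exponential backoff. The function send_with_retry, its parameters, and the choice of ConnectionError as the transient failure are all assumptions made for this example, not the service's actual mechanism.

```python
import time
from typing import Callable, Sequence, Tuple

Sample = Tuple[str, float]  # (metric name, value); a simplified stand-in


def send_with_retry(
    send: Callable[[Sequence[Sample]], None],
    batch: Sequence[Sample],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send a batch, retrying on ConnectionError with exponential backoff.

    Returns the number of attempts used. Re-raises the last error if every
    attempt fails, so the caller can decide what to do with the batch.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            send(batch)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Wait 0.5 s, 1 s, 2 s, ... (with the defaults) before trying again.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_send(batch: Sequence[Sample]) -> None:
        """Hypothetical sender that fails twice, then succeeds."""
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("transient failure")

    attempts = send_with_retry(flaky_send, [("up", 1.0)], sleep=lambda _s: None)
    print(attempts)  # 3
```

Keeping the batch in memory until a send succeeds is what preserves the data across a transient failure; a production agent would also need to bound that buffer.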

Managed Service for Prometheus vs. open-source Prometheus

Capability	Managed Service for Prometheus	Open-source Prometheus
Infrastructure	Fully managed. No servers to purchase or maintain.	Purchase, deploy, and maintain your own infrastructure.
O&M	None required.	Routine O&M required.
High availability	Multi-replica collection and storage with horizontal scaling.	Single process. No horizontal scaling.
Data access	Pre-built integrations for cloud services, databases, middleware, and applications in Java, Go, and other languages. Monitor middleware on Elastic Compute Service (ECS) instances without installing an agent.	Build and maintain an exporter for each component.
Storage capacity	Unlimited cloud storage.	Limited by local disk capacity.
Visualization	Built-in Grafana with ready-to-use monitoring dashboards.	Deploy Grafana and configure dashboards manually.
Alert management	Integrated with the ARMS alert center to improve alert efficiency and accuracy.	Install and configure Alertmanager separately.
Single-replica collection (2-core CPU, 4 GB memory)	6 million time points per scrape.	1 million time points per scrape.
Query performance (600 million time points)	8 to 10 seconds.	180 seconds.
Security	Integrated with Alibaba Cloud security capabilities, including authentication.	Not supported.
Pre-aggregation	Supported.	Not supported.