Managed Service for Prometheus is a fully managed, Prometheus-compatible monitoring service that lets you monitor a wide variety of components and collect metrics from mainstream open-source infrastructure. It provides the same open-source Prometheus experience -- PromQL, Grafana dashboards, and native collection rules -- without the operational overhead of self-hosted Prometheus infrastructure.
Out-of-the-box monitoring
Set up monitoring in minutes instead of days:
Instant Kubernetes monitoring: Create a Prometheus instance for any Kubernetes cluster or cloud resource directly from the Application Real-Time Monitoring Service (ARMS) console. A container monitoring setup that typically takes 3 days to build from scratch takes 10 minutes with Managed Service for Prometheus.
Built-in alerting and dashboards: Pre-configured Grafana dashboards and alerting rules cover common application components and cloud services. No manual dashboard creation or Alertmanager installation required.
Managed operations: Health inspection, automatic agent upgrades, and visualized configuration are built in, reducing the operational burden of self-hosted Prometheus.
Cost-effective
Free Kubernetes metrics: Monitor Kubernetes components at no cost with a set of included metrics.
No infrastructure to manage: The service is fully managed. You do not need to purchase or maintain additional servers, and nearly all O&M overhead is eliminated.
Fast time to value: Integrates directly with Container Service for Kubernetes (ACK). Start monitoring ACK clusters without building a separate monitoring stack.
Compatible with open-source Prometheus
If you already run Prometheus, your existing configuration works here:
Native collection rules: Supports Prometheus.yaml scrape configs, ServiceMonitor custom resources, and annotation-based auto-discovery. ServiceMonitor is especially useful for monitoring custom Kubernetes clusters.
Standard query and data model: Supports PromQL queries, custom multi-dimensional data models, and the HTTP API -- the same interfaces you already use with open-source Prometheus.
Flexible service discovery: Static file configurations and dynamic discovery mechanisms make it straightforward to migrate existing monitoring setups, integrate new data sources, and discover monitoring targets.
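Because the service exposes PromQL through the HTTP API, a client written for open-source Prometheus should need little more than a new base URL. The following is a minimal sketch, not an official client: it targets the standard open-source `/api/v1/query` instant-query endpoint, the host name is a placeholder, and any authentication your instance requires is omitted.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder; substitute the HTTP API URL of your own instance.
BASE_URL = "http://prometheus.example.internal:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the standard /api/v1/query endpoint."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def parse_vector(payload: dict) -> dict:
    """Map each series' sorted label pairs to its sample value as a float."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    return {
        tuple(sorted(series["metric"].items())): float(series["value"][1])
        for series in data["result"]
    }


def instant_query(base_url: str, promql: str, timeout: float = 10.0) -> dict:
    """Run a PromQL instant query over HTTP and return the parsed vector."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_vector(json.load(resp))
```

Calling `instant_query(BASE_URL, "sum by (job) (up)")` would return the number of healthy scrape targets per job, keyed by label set. URL construction and response parsing are kept separate from the network call so that each can be checked against a saved response.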
Scalable cloud storage
No storage ceiling: Cloud-based distributed storage scales with your data. There is no upper limit on data volume, and distributed replication safeguards data reliability.
Cross-cluster queries: Use Global DataSource and Global View to query and visualize metrics across multiple Kubernetes clusters in a single aggregate view.
High performance
Managed Service for Prometheus outperforms self-hosted Prometheus in both collection throughput and query speed:
6x single-replica throughput: On 2-core CPU / 4 GB memory hardware, each replica collects 6 million time points per scrape, compared to 1 million for open-source Prometheus. The collection component is optimized to improve single-replica throughput while reducing resource consumption.
20x collection efficiency: A lightweight agent architecture requires only a single process per Kubernetes cluster, improving collection performance by 20 times over open-source Prometheus. The Prometheus agent is deployed on the user side, leveraging native collection capabilities to minimize resource usage.
18-22x faster queries: Querying 0.6 billion time points takes 8 to 10 seconds, compared to 180 seconds with open-source Prometheus.
Separation of collection and storage: Decoupling these layers lets each scale independently, removing the single-process bottleneck of open-source Prometheus.
Auto-scaling agents: The number of agent replicas increases or decreases automatically based on workload, distributing collection tasks across replicas.
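The headline multipliers follow directly from the figures quoted above. As a quick check, this short sketch recomputes them; the variable and function names are illustrative only.

```python
# Figures quoted in the performance claims above.
MANAGED_POINTS_PER_SCRAPE = 6_000_000  # per replica, 2-core CPU / 4 GB memory
OSS_POINTS_PER_SCRAPE = 1_000_000
MANAGED_QUERY_SECONDS = (8, 10)        # best and worst case for 0.6 billion points
OSS_QUERY_SECONDS = 180


def speedup(baseline_seconds: float, improved_seconds: float) -> float:
    """Return how many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


throughput_gain = MANAGED_POINTS_PER_SCRAPE / OSS_POINTS_PER_SCRAPE
best, worst = MANAGED_QUERY_SECONDS
query_speedup = (
    speedup(OSS_QUERY_SECONDS, worst),
    speedup(OSS_QUERY_SECONDS, best),
)

print(f"throughput gain: {throughput_gain:.0f}x")
# prints: throughput gain: 6x
print(f"query speedup: {query_speedup[0]:.1f}x to {query_speedup[1]:.1f}x")
# prints: query speedup: 18.0x to 22.5x
```

The upper bound works out to 22.5x, which the "18-22x" label rounds down.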
High availability
Dual-replica architecture: Data collection, processing, and storage components each run with multiple replicas to maintain high availability across core data paths.
Horizontal scaling: Elastic scale-out adjusts capacity based on cluster size. No manual intervention required.
Automatic data retransmission: A built-in retry mechanism preserves data integrity and accuracy, even during transient failures.
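The documentation does not describe how the retry mechanism is implemented. As a generic illustration of the idea only, the sketch below retries a failed batch send with exponential backoff. The `send` callable, the retry limits, and the choice of `ConnectionError` as the transient failure are all assumptions made for this example.

```python
import time
from typing import Callable, Sequence


def send_with_retry(
    send: Callable[[Sequence[float]], None],
    batch: Sequence[float],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send a batch, retrying transient failures with exponential backoff.

    Returns the number of attempts used. Re-raises the last error once
    max_attempts is exhausted, so the caller can buffer or drop the batch.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            send(batch)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. A production sender would typically also add jitter and cap the maximum delay.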
Managed Service for Prometheus vs. open-source Prometheus
| Capability | Managed Service for Prometheus | Open-source Prometheus |
| --- | --- | --- |
| Infrastructure | Fully managed. No servers to purchase or maintain. | Purchase, deploy, and maintain your own infrastructure. |
| O&M | None required. | Routine O&M required. |
| High availability | Multi-replica collection and storage with horizontal scaling. | Single process. No horizontal scaling. |
| Data access | Pre-built integrations for cloud services, databases, middleware, and applications in Java, Go, and other languages. Monitor middleware on Elastic Compute Service (ECS) instances without installing an agent. | Build and maintain an exporter for each component. |
| Storage capacity | Unlimited cloud storage. | Limited by local disk capacity. |
| Visualization | Built-in Grafana with ready-to-use monitoring dashboards. | Deploy Grafana and configure dashboards manually. |
| Alert management | Integrated with the ARMS alert center to improve alert efficiency and accuracy. | Install and configure Alertmanager separately. |
| Single-replica collection (2-core CPU, 4 GB memory) | 6 million time points per scrape. | 1 million time points per scrape. |
| Query performance (0.6 billion time points) | 8 to 10 seconds. | 180 seconds. |
| Security | Integrated with Alibaba Cloud security capabilities, including authentication. | Not supported. |
| Pre-aggregation | Supported. | Not supported. |