E-MapReduce: Overview

Last Updated: Dec 21, 2023

JindoCache, formerly called JindoFSx, is a service provided by Alibaba Cloud E-MapReduce (EMR) that accelerates access to cloud-native data lakes. JindoCache supports acceleration features such as data caching and metadata caching, and applies different read and write policies depending on the CacheSet that is used. This allows it to meet the access acceleration requirements of data lakes in different scenarios.

Background information

CacheSet is the cache abstraction of JindoCache. In practice, not all data requires caching-based acceleration. JindoCache therefore provides fine-grained access policies that reflect the varied computing requirements and usage scenarios of data lakes, and you can configure these policies based on your business requirements. For example, you can select an aggressive metadata caching policy, or choose not to cache specific data, to balance performance against resource utilization.

Scenarios

JindoCache can be used in the following scenarios:

  • Presto queries in online analytical processing (OLAP) scenarios: Improves query performance and shortens the query time.

  • Use of HBase in DataServing scenarios: Significantly reduces the P99 latency and request fees.

  • Hive or Spark reports for big data analysis: Reduces the amount of time that is required to generate a report and the costs of computing clusters.

  • Lake and warehouse integration: Reduces request fees and the response latency of data catalogs.

  • Artificial intelligence (AI): Accelerates training and AI operations in other scenarios, reduces the costs of AI clusters, and provides comprehensive capability support.

Caching policies

JindoCache supports data caching and metadata caching. Data caching types include distributed data caching, consistent hashing-based data caching, and local caching.
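
To make the idea of consistent hashing-based data caching concrete, the following Python sketch shows a generic consistent-hash ring that maps cache keys to cache nodes. It illustrates the general technique only and is not JindoCache's implementation. The class name HashRing, the node names, the key format, and the number of virtual nodes are all assumptions chosen for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer with MD5, which is stable across runs."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes   # virtual nodes per physical node (assumed value)
        self._points = []       # sorted hash positions on the ring
        self._owner = {}        # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `vnodes` positions for this node on the ring."""
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # skip the (extremely unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        """Remove every ring position that belongs to this node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node that owns the first ring position after the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

With a ring such as HashRing(["cache-1", "cache-2", "cache-3"]), calling lookup("oss://bucket/table/part-0") (a hypothetical key) always returns the same node. When a fourth node is added, only the keys that now fall on the new node's ring positions change owner, so most cached data stays where it is. This is the property that makes consistent hashing attractive for caches whose node count changes.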