SmartData is the storage service of the E-MapReduce (EMR) Jindo engine. It provides centralized storage, caching, and computing optimization for EMR compute engines, and it extends their storage features. SmartData consists of JindoFS, JindoTable, and related tools. This topic describes the updates in SmartData 3.5.X.
OSS storage scalability on JindoFS
The performance of deleting Object Storage Service (OSS) directories is optimized.
JindoSDK
- If JindoSDK for Java is used, JindoSDK logs are generated as Java logs, which makes issues easier to diagnose.
- JindoSDK now logs statistics about its memory usage. You can view these logs to check how much memory JindoSDK uses.
JindoTable-based computing optimization
- The native query acceleration feature is added. It accelerates queries when you use Spark, Hive, or Presto to read data from ORC or Parquet files that are stored in OSS or JindoFS. For more information, see Enable query acceleration based on a native engine.
- JindoTable can collect infrequent-access statistics for Hive tables and partitions. For more information, see Use JindoTable to collect infrequent-access statistics of tables and partitions.
Other JindoFS tools
Jindo DistCp is enhanced. For more information, see Use Jindo DistCp.
- CloudMonitor can be used to monitor tasks and send alerts when tasks fail.
- The dependency on Advanced Vector Extensions (AVX) is removed.
- The Cold Archive storage class is supported when data is written to OSS.