This topic describes the known issues in JindoData 4.X.
JindoData 4.6.X
4.6.2
By default, the fs.oss.checksum.crc64.enable parameter is set to true in JindoSDK 4.6.0 and later versions. As a result, CRC-64 is used to verify data integrity on the write path.
However, CRC-64 verification reduces the performance of writing data to OSS-HDFS. If performance is your primary concern, we recommend that you disable CRC-64-based verification as follows:
1. In the E-MapReduce (EMR) console, go to the Configure tab of the Hadoop-Common service page.
2. Click the core-site.xml tab and add a configuration item whose key is fs.oss.checksum.crc64.enable and whose value is false.
For more information about how to add a configuration item, see Manage configuration items.
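For readers who maintain core-site.xml by hand rather than through the EMR console, the following minimal sketch renders the same key-value pair as a standard Hadoop property element. The helper names make_property and render_property are hypothetical and exist only for this illustration; the key and value come from the description above.

```python
import xml.etree.ElementTree as ET


def make_property(name: str, value: str) -> ET.Element:
    """Build one Hadoop-style <property> element with <name> and <value> children."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def render_property(name: str, value: str) -> str:
    """Return the property as an XML string that can be pasted into core-site.xml."""
    return ET.tostring(make_property(name, value), encoding="unicode")


# Key and value are taken from the text above; "false" disables CRC-64 verification.
snippet = render_property("fs.oss.checksum.crc64.enable", "false")
print(snippet)
```

The printed element belongs inside the configuration root element of core-site.xml. In an EMR cluster, adding the configuration item in the console as described above remains the documented path.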
4.6.1
In an EMR cluster with JindoSDK 4.6.1 deployed, when you access OSS-HDFS in password-free mode, an error is reported indicating that the access token needs to be updated. As a result, multiple jobs are interrupted.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.6.1 deployed, when you use JindoUtil in password-free mode, a permission error occurs.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
By default, the fs.oss.checksum.crc64.enable parameter is set to true in JindoSDK 4.6.0 and later versions. As a result, CRC-64 is used to verify data integrity on the write path.
However, CRC-64 verification reduces the performance of writing data to OSS-HDFS. If performance is your primary concern, we recommend that you disable CRC-64-based verification as follows:
1. In the E-MapReduce (EMR) console, go to the Configure tab of the Hadoop-Common service page.
2. Click the core-site.xml tab and add a configuration item whose key is fs.oss.checksum.crc64.enable and whose value is false.
For more information about how to add a configuration item, see Manage configuration items.
4.6.0
In an EMR cluster with JindoSDK 4.6.0 deployed, when you access OSS-HDFS in password-free mode, an error is reported indicating that the access token needs to be updated. As a result, multiple jobs are interrupted.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.6.0 and JindoFSx 4.6.0 deployed and Kerberos authentication enabled, if you set the fs.oss.credentials.provider parameter to com.aliyun.jindodata.oss.auth.RangerCredentialsProvider, a memory leak occurs in JindoFSx Namespace Service.
To resolve this issue, update the versions of JindoFSx and JindoSDK to 4.6.2 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console and Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.6.0 deployed, when you use JindoUtil in password-free mode, a permission error occurs.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
By default, the fs.oss.checksum.crc64.enable parameter is set to true in JindoSDK 4.6.0 and later versions. As a result, CRC-64 is used to verify data integrity on the write path.
However, CRC-64 verification reduces the performance of writing data to OSS-HDFS. If performance is your primary concern, we recommend that you disable CRC-64-based verification as follows:
1. In the E-MapReduce (EMR) console, go to the Configure tab of the Hadoop-Common service page.
2. Click the core-site.xml tab and add a configuration item whose key is fs.oss.checksum.crc64.enable and whose value is false.
For more information about how to add a configuration item, see Manage configuration items.
JindoData 4.5.X
4.5.2
In an EMR cluster with JindoSDK 4.5.2 and JindoFSx 4.5.2 deployed and Kerberos authentication enabled, if you set the fs.oss.credentials.provider parameter to com.aliyun.jindodata.oss.auth.RangerCredentialsProvider, a memory leak occurs in JindoFSx Namespace Service.
To resolve this issue, update the versions of JindoFSx and JindoSDK to 4.6.2 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console and Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.5.2 deployed, when you use JindoUtil in password-free mode, a permission error occurs.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
4.5.1
In an EMR cluster with JindoSDK 4.5.1 and JindoFSx 4.5.1 deployed and Kerberos authentication enabled, if you set the fs.oss.credentials.provider parameter to com.aliyun.jindodata.oss.auth.RangerCredentialsProvider, a memory leak occurs in JindoFSx Namespace Service.
To resolve this issue, update the versions of JindoFSx and JindoSDK to 4.6.2 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console and Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.5.1 deployed, when you use JindoUtil in password-free mode, a permission error occurs.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
4.5.0
In an EMR cluster with JindoSDK 4.5.0 and JindoFSx 4.5.0 deployed and Kerberos authentication enabled, if you set the fs.oss.credentials.provider parameter to com.aliyun.jindodata.oss.auth.RangerCredentialsProvider, a memory leak occurs in JindoFSx Namespace Service.
To resolve this issue, update the versions of JindoFSx and JindoSDK to 4.6.2 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console and Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.5.0 deployed, when a request to Object Storage Service (OSS) or OSS-HDFS in password-free mode fails and is retried, the access token cannot be updated. As a result, multiple jobs are interrupted.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.5.0 deployed, when you use JindoUtil in password-free mode, a permission error occurs.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
JindoData 4.4.X
In an EMR cluster with JindoSDK 4.4.0, 4.4.1, or 4.4.2 and JindoFSx 4.4.0, 4.4.1, or 4.4.2 deployed and Kerberos authentication enabled, if you set the fs.oss.credentials.provider parameter to com.aliyun.jindodata.oss.auth.RangerCredentialsProvider, a memory leak occurs in JindoFSx Namespace Service.
To resolve this issue, update the versions of JindoFSx and JindoSDK to 4.6.2 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console and Upgrade JindoSDK in EMR clusters in the new EMR console.
In an EMR cluster with JindoSDK 4.4.0 deployed, when you initiate high-concurrency access to OSS or OSS-HDFS in password-free mode, core dumps may occur.
To resolve this issue, use a fixed AccessKey pair or update the version of JindoSDK to 4.6.2 or later. For more information about how to update the version of JindoSDK, see Upgrade JindoSDK in EMR clusters in the new EMR console.
JindoData 4.3.X
In an EMR V3.40.0 or V5.6.0 cluster with JindoSDK 4.3.0 deployed, the time when a directory was last updated is not displayed. This is because displaying the directory time degrades the performance of the ls command.
To display the time, update the version of JindoSDK to 4.3.1 or later. For more information, see Upgrade JindoSDK in EMR clusters in the new EMR console.
When you use MagicCommitter in an EMR V3.40.0 or V5.6.0 cluster with JindoSDK 4.3.0 deployed, the following error message appears: "Part number must be an integer between 1 and 10000". This is because the uploadPart operation is called too many times.
To resolve this issue, update the version of JindoSDK to 4.3.1 or later. For more information, see Upgrade JindoSDK in EMR clusters in the new EMR console.
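The limit quoted in the error message can be illustrated with a small calculation. The following sketch is not MagicCommitter or JindoSDK logic. It only shows how the number of uploadPart calls relates to the object size and the part size, and how to check that number against the 1 to 10000 range. The function names and the sample sizes are hypothetical.

```python
MAX_PARTS = 10_000  # upper bound on part numbers quoted in the error message


def part_count(object_size: int, part_size: int) -> int:
    """Number of parts needed to upload object_size bytes in part_size chunks."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (object_size + part_size - 1) // part_size


def min_part_size(object_size: int) -> int:
    """Smallest part size in bytes that keeps the upload within MAX_PARTS parts."""
    return max(1, (object_size + MAX_PARTS - 1) // MAX_PARTS)


# Illustrative sizes only: a 100 GiB object uploaded in 8 MiB parts.
size = 100 * 1024**3
parts = part_count(size, 8 * 1024**2)
print(parts, parts <= MAX_PARTS)
print(part_count(size, min_part_size(size)))
```

With these sample values the 8 MiB part size needs more than 10000 parts, whereas the part size returned by min_part_size stays within the limit.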
When you read cached data in specific paths on a JindoFSx server of version 4.3.0, an error occurs, but the error is not returned to the client. As a result, the client returns invalid data.
To resolve this issue, update the version of JindoFSx to 4.3.1 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console.
A JindoFSx server of version 4.3.0 cannot process the commands that are used to preload data to the memory for data caching. As a result, an error may occur when you load data to the memory, and invalid data may be read from the cache.
To resolve this issue, update the version of JindoFSx to 4.3.1 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console.
File handle leaks occur on a JindoFSx server of version 4.3.0 or 4.3.1. After the server runs for a long period of time, the number of file handles that are opened by the processes of the operating system may reach the upper limit. As a result, the server cannot open new file handles, and the service is unavailable.
To resolve this issue, update the version of JindoFSx to 4.3.2 or later. For more information, see Upgrade JindoData in EMR clusters in the new EMR console.
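Until the server is upgraded, it may help to watch how close a process is to its file handle limit. The following is a minimal sketch, assuming a Linux host where the open handles of a process are listed under /proc/<pid>/fd. The function names and the 0.8 threshold are hypothetical and are not part of JindoFSx.

```python
import os


def count_open_fds(fd_dir: str) -> int:
    """Count entries in a process's fd directory, for example /proc/<pid>/fd on Linux."""
    return len(os.listdir(fd_dir))


def near_limit(open_fds: int, limit: int, threshold: float = 0.8) -> bool:
    """Return True when open_fds has reached the given fraction of the limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return open_fds / limit >= threshold


# Example with made-up numbers: 900 handles open against a limit of 1024.
print(near_limit(900, 1024))
print(near_limit(100, 1024))
```

To apply the check to a running server, pass the path /proc/<pid>/fd for the server process to count_open_fds and use the limit that is configured for that process on your host.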
JindoData 4.2.X
In JindoSDK 4.2.0, an overflow issue may occur when you call the seek method for large files. As a result, multiple jobs that use the seek method may fail to read large files from OSS.
JindoData 4.1.X
In JindoSDK 4.1.0, an overflow issue may occur when you call the seek method for large files. As a result, multiple jobs that use the seek method may fail to read large files from OSS.
JindoData 4.0.X
In JindoSDK 4.0.0 (EMR V3.39.0 or EMR V5.5.0), an overflow issue may occur when you call the seek method for large files. As a result, multiple jobs that use the seek method may fail to read large files from OSS.
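The seek overflow described for JindoSDK 4.0.0, 4.1.0, and 4.2.0 is not explained in detail here. As a general illustration only, the following sketch shows what happens when a file offset beyond 2 GiB is stored in a signed 32-bit integer. The assumption that a 32-bit value is involved is hypothetical and is not confirmed by this topic.

```python
import ctypes


def as_int32(offset: int) -> int:
    """Reinterpret the low 32 bits of offset as a signed 32-bit integer."""
    # ctypes integer types do no overflow checking, so large values wrap around.
    return ctypes.c_int32(offset).value


# Assumption for illustration: a seek target 3 GiB into a large file.
target = 3 * 1024**3
print(as_int32(2**31 - 1))  # largest offset that still fits
print(as_int32(target))     # wraps to a negative offset
```

An offset that wraps to a negative value no longer points at the intended position, which is the kind of failure that only appears once files grow past a size threshold.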
Other issues
JindoSDK does not allow you to write files that are larger than 80 GB in size to OSS.
JindoSDK does not allow you to write data to OSS in append mode.
JindoSDK does not support client-side encryption of data that is uploaded to OSS.
JindoSDK does not support the earlier versions of JindoFS in block mode or cache mode.
OSS-HDFS does not support in-place upgrades from earlier versions of JindoFS in block mode.
Instead, you can use JindoDistCp to migrate data from an earlier version to the latest version.
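The 80 GB write limit listed above can be checked before a job starts. The following is a minimal sketch with hypothetical names. It assumes that 80 GB means 80 × 1024³ bytes, which this topic does not specify.

```python
# Assumption: "80 GB" is interpreted as 80 GiB. Adjust if your limit is decimal.
OSS_WRITE_LIMIT_BYTES = 80 * 1024**3


def can_write_to_oss(size_bytes: int) -> bool:
    """Return True when a file of size_bytes is within the assumed write limit."""
    return 0 <= size_bytes <= OSS_WRITE_LIMIT_BYTES


def oversized(files: dict[str, int]) -> list[str]:
    """Return the names of files whose size exceeds the assumed limit, sorted."""
    return sorted(name for name, size in files.items() if not can_write_to_oss(size))


# Made-up file names and sizes, for illustration only.
candidates = {"small.parquet": 5 * 1024**3, "huge.bin": 81 * 1024**3}
print(oversized(candidates))
```

Files reported by oversized would need to be split or written through another path before they are sent to OSS with JindoSDK.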