TableTunnel is an entry class of the MaxCompute Tunnel service. You can use TableTunnel to upload or download table data only. Views cannot be uploaded or downloaded.
Definition
public class TableTunnel {
    public DownloadSession createDownloadSession(String projectName, String tableName);
    public DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec);
    public UploadSession createUploadSession(String projectName, String tableName, boolean overwrite);
    public UploadSession createUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, boolean overwrite);
    public DownloadSession getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id);
    public DownloadSession getDownloadSession(String projectName, String tableName, String id);
    public UploadSession getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id);
    public UploadSession getUploadSession(String projectName, String tableName, String id);
}
Implementation process
- The RecordWriter.write() method uploads your data as files to a temporary directory.
- The RecordWriter.close() method moves the files from the temporary directory to a data directory.
- The session.commit() method moves all files from the data directory to the directory in which the required table is saved, and updates the table metadata. This way, the data moved into a table by the current job is visible to other MaxCompute jobs such as SQL and MapReduce jobs.
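The three stages above can be pictured with a small local-filesystem analogy. This is only a sketch under stated assumptions: the class StagedUploadSketch, its directory names, and the block-file naming are hypothetical and are not part of the MaxCompute SDK, and the real service performs these moves on the server side. It illustrates why data stays invisible to readers of the table directory until the commit step.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local analogy of the write/close/commit stages. NOT the Tunnel implementation;
// all names here are hypothetical.
class StagedUploadSketch {
    final Path tempDir;   // stage 1: write() places data here
    final Path dataDir;   // stage 2: close() moves finished files here
    final Path tableDir;  // stage 3: commit() makes files visible here

    StagedUploadSketch(Path root) throws IOException {
        tempDir = Files.createDirectories(root.resolve("temp"));
        dataDir = Files.createDirectories(root.resolve("data"));
        tableDir = Files.createDirectories(root.resolve("table"));
    }

    // Stage 1: append a record to the block's file in the temporary directory.
    void write(long blockId, String record) throws IOException {
        Path file = tempDir.resolve("block-" + blockId);
        Files.write(file, (record + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Stage 2: closing a writer moves its file from temp to the data directory.
    void close(long blockId) throws IOException {
        String name = "block-" + blockId;
        Files.move(tempDir.resolve(name), dataDir.resolve(name),
                StandardCopyOption.REPLACE_EXISTING);
    }

    // Stage 3: commit moves every file from the data directory to the table directory.
    int commit() throws IOException {
        List<Path> files;
        try (Stream<Path> s = Files.list(dataDir)) {
            files = s.collect(Collectors.toList());
        }
        for (Path f : files) {
            Files.move(f, tableDir.resolve(f.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        return files.size();
    }

    // Counts the entries in a directory.
    static long count(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            return s.count();
        }
    }

    public static void main(String[] args) throws IOException {
        StagedUploadSketch s = new StagedUploadSketch(Files.createTempDirectory("tunnel-sketch"));
        s.write(0, "a");
        s.write(0, "b");
        s.write(1, "c");
        s.close(0);
        s.close(1);
        System.out.println("files visible before commit: " + count(s.tableDir));
        s.commit();
        System.out.println("files visible after commit: " + count(s.tableDir));
    }
}
```

In this analogy, a reader that only looks at the table directory sees nothing until commit() runs, which mirrors the visibility rule described for session.commit().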
Limits
- The value of a block ID must be greater than or equal to 0 but less than 20000. The size of the data that you want to upload in a block cannot exceed 100 GB.
- A session is uniquely identified by its ID. The lifecycle of a session is 24 hours. If your session times out due to the transfer of large amounts of data, you must transfer your data in multiple sessions.
- The lifecycle of an HTTP request that corresponds to a RecordWriter is 120 seconds. If no data flows over an HTTP connection within 120 seconds, the server closes the connection.
  HTTP has an 8 KB buffer. When you call the RecordWriter.write() method, your data may be held in the buffer, so no inbound traffic flows over the HTTP connection. In this case, you can call the TunnelRecordWriter.flush() method to forcibly flush data from the buffer.
- If you use a RecordWriter to write logs to MaxCompute, the write operation may time out due to unexpected traffic fluctuations. We recommend that you perform the following operations:
  - Do not use a RecordWriter for each data record. Each RecordWriter corresponds to a file, so using one RecordWriter per record generates a large number of small files. This affects the performance of MaxCompute.
  - Cache your data records. When the size of the cached data reaches 64 MB, use a RecordWriter to write multiple data records at a time.
- The lifecycle of a RecordReader is 300 seconds.
- If the endpoint that is used to access the MaxCompute service is a public endpoint, the users are charged for data downloads. For more information about public endpoints, see Endpoints.
- If the download control feature is enabled, users who use public endpoints to download data must have the related download permissions. For more information about the authorization, see Download control.
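The batching recommendation and the block ID range in the limits above can be sketched as a small buffering helper. This is a minimal sketch under stated assumptions: RecordBatcher and its BatchSink interface are hypothetical names and not part of the MaxCompute SDK. The sink stands in for "open one RecordWriter for a block, write all cached records, close it", and records are modeled as plain strings. The 64 MB threshold and the 0 to 19999 block ID range come from the limits listed above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the MaxCompute SDK: caches records and hands
// them to a sink in batches, one block ID per batch.
class RecordBatcher {
    static final long DEFAULT_THRESHOLD_BYTES = 64L * 1024 * 1024; // 64 MB, per the limits
    static final long MAX_BLOCK_ID_EXCLUSIVE = 20000;              // valid IDs: 0..19999

    // Stand-in for "open a RecordWriter on blockId, write the records, close it".
    interface BatchSink {
        void writeBatch(long blockId, List<String> records);
    }

    private final long thresholdBytes;
    private final BatchSink sink;
    private final List<String> buffer = new ArrayList<>();
    private long bufferedBytes = 0;
    private long nextBlockId = 0;

    RecordBatcher(long thresholdBytes, BatchSink sink) {
        this.thresholdBytes = thresholdBytes;
        this.sink = sink;
    }

    static boolean isValidBlockId(long blockId) {
        return blockId >= 0 && blockId < MAX_BLOCK_ID_EXCLUSIVE;
    }

    // Caches one record and flushes once the cached size reaches the threshold.
    void add(String record) {
        buffer.add(record);
        bufferedBytes += record.getBytes(StandardCharsets.UTF_8).length;
        if (bufferedBytes >= thresholdBytes) {
            flush();
        }
    }

    // Writes all cached records as one batch; does nothing if the cache is empty.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        if (!isValidBlockId(nextBlockId)) {
            throw new IllegalStateException("block ID out of range: " + nextBlockId);
        }
        sink.writeBatch(nextBlockId, new ArrayList<>(buffer));
        nextBlockId++;
        buffer.clear();
        bufferedBytes = 0;
    }

    long batchesWritten() {
        return nextBlockId;
    }

    public static void main(String[] args) {
        // A tiny 10-byte threshold keeps the demo small; real use would pass
        // DEFAULT_THRESHOLD_BYTES.
        RecordBatcher batcher = new RecordBatcher(10,
                (blockId, records) ->
                        System.out.println("block " + blockId + ": " + records.size() + " records"));
        for (int i = 0; i < 5; i++) {
            batcher.add("aaaa"); // 4 bytes each
        }
        batcher.flush(); // write whatever remains in the cache
    }
}
```

Because each batch maps to one writer and one block, the number of files grows with the number of batches rather than the number of records, which is the point of the recommendation. A caller must still invoke flush() once at the end so that the final partial batch is written.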