Some applications may upload large amounts of data to Object Storage Service (OSS) every day and you may want to perform structured analysis on large text objects. In this case, you can use the external table query feature to load data stored in OSS to MaxCompute for analysis. MaxCompute can analyze large amounts of OSS data within several minutes, which allows you to efficiently explore data value at low costs.
Prerequisites
A bucket is created. For more information, see Create a bucket.
MaxCompute is authorized to access OSS.
After you log on to the RAM console with an Alibaba Cloud account, go to the Cloud Resource Access Authorization page for authorization.
A MaxCompute project is created. For more information, see Create a MaxCompute project.
The MaxCompute client is installed and configured. For more information, see Install and configure the MaxCompute client.
Procedure
Upload the data collected by IoT to OSS.
Prepare the data.
Create a vehicle.csv file on your local computer. The file contains the following sample data:
1,1,51,1,46.81006,-92.08174,9/14/2014 0:00,S 1,2,13,1,46.81006,-92.08174,9/14/2014 0:00,NE 1,3,48,1,46.81006,-92.08174,9/14/2014 0:00,NE 1,4,30,1,46.81006,-92.08174,9/14/2014 0:00,W 1,5,47,1,46.81006,-92.08174,9/14/2014 0:00,S 1,6,9,1,46.81006,-92.08174,9/14/2014 0:00,S 1,7,53,1,46.81006,-92.08174,9/14/2014 0:00,N 1,8,63,1,46.81006,-92.08174,9/14/2014 0:00,SW 1,9,4,1,46.81006,-92.08174,9/14/2014 0:00,NE 1,10,31,1,46.81006,-92.08174,9/14/2014 0:00,N
Upload the vehicle.csv file to the demo/ directory of the examplebucket bucket in the China (Hangzhou) region. For more information, see Upload objects.
Run the MaxCompute client.
For more information, see Run the MaxCompute client.
Create an external table by using MaxCompute. For more information, see Create tables.
Create a non-partitioned table named data_csv_external. Example:
CREATE EXTERNAL TABLE IF NOT EXISTS data_csv_external ( vehicleId int, recordId int, patientId int, calls int, locationLatitute double, locationLongtitue double, recordTime string, direction string ) STORED BY 'com.aliyun.odps.CsvStorageHandler' LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/examplebucket/demo/';
Query the external table by using MaxCompute.
Run the following SQL statement:
select recordId, patientId, direction from data_csv_external where patientId > 25;
The following output is returned:
+------------+------------+-----------+ | recordId | patientId | direction | +------------+------------+-----------+ | 1 | 51 | S | | 3 | 48 | NE | | 4 | 30 | W | | 5 | 47 | S | | 7 | 53 | N | | 8 | 63 | SW | | 10 | 31 | N | +------------+------------+-----------+
FAQ
What do I do if the Accessing project '<projectname>' failed: ODPS-0420095: Access Denied - Authorization Failed [4002], You don't exist in project <projectname>.
error message is reported?
References
MaxCompute allows you to create an OSS external table in a project to create a mapping between the OSS external table and a directory in OSS. You can use the OSS external table to access unstructured data in the OSS directory or to write data from a MaxCompute project to OSS. For more information, see Create an OSS external table.
You can export data from a MaxCompute table to your local computer for offline viewing. For more information, see Execute SQL statements and export the result data.
If you no longer need the table or MaxCompute project, you can delete the table or MaxCompute project to avoid unnecessary resource waste and fees. For more information, see Delete a table or a MaxCompute project.