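UDF example: Reference table resources - MaxCompute

This topic uses the MaxCompute client as an example to describe how to reference a table resource from a Python UDF.

Prerequisites

Make sure that the following operations are complete:

The MaxCompute client is installed and configured.
For more information, see Install and configure the MaxCompute client.
The table to be referenced has been added as a resource in the MaxCompute project.
The sample table resource used in this topic is udf_test, which contains the following data.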
```
+------------+------+
| col1       | col2 |
+------------+------+
| 1          | a    |
| 2          | b    |
| 4          | c    |
| 5          | d    |
+------------+------+
```
For more information about how to add resources, see Add resources.

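Development and usage procedure

1. Write the code

The following Python UDF reads the referenced table resource (for example, udf_test) once at initialization and then returns one record per call. Each record is an array of column values and is returned as its string representation. After all records have been returned, the function returns NULL.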

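from odps.udf import annotate
from odps.distcache import get_cache_table

# The UDF takes no arguments and returns a STRING.
@annotate('->string')
class DistCacheTableExample(object):
    def __init__(self):
        # Load every record of the udf_test table resource into memory once.
        self.records = list(get_cache_table('udf_test'))
        # Index of the next record to return.
        self.counter = 0
        self.ln = len(self.records)

    def evaluate(self):
        # Return None (NULL in SQL) once all records have been returned.
        if self.counter > self.ln - 1:
            return None
        ret = self.records[self.counter]
        self.counter += 1
        # Convert the record to its string representation.
        return str(ret)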

Save the sample code above as a Python script (for example, table.py) and place it in the bin directory of the MaxCompute client.
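Before uploading, it can help to exercise the class logic on a local machine. The following is a minimal sketch, not an official testing method: it registers stub modules named odps, odps.udf, and odps.distcache so that the imports in the UDF resolve without the MaxCompute runtime. The names FAKE_TABLES, fake_annotate, and fake_get_cache_table are hypothetical, and the sketch assumes that get_cache_table yields each row as a tuple of column values.

```python
import sys
import types

# Hypothetical stand-in data that mirrors the udf_test table resource.
FAKE_TABLES = {'udf_test': [(1, 'a'), (2, 'b'), (4, 'c'), (5, 'd')]}


def fake_annotate(signature):
    # No-op replacement for odps.udf.annotate: returns the class unchanged.
    def decorator(cls):
        return cls
    return decorator


def fake_get_cache_table(name):
    # Assumption: the real function yields one tuple per table row.
    return iter(FAKE_TABLES[name])


# Register stub modules so that the imports below resolve locally.
odps_pkg = types.ModuleType('odps')
udf_mod = types.ModuleType('odps.udf')
cache_mod = types.ModuleType('odps.distcache')
udf_mod.annotate = fake_annotate
cache_mod.get_cache_table = fake_get_cache_table
odps_pkg.udf = udf_mod
odps_pkg.distcache = cache_mod
sys.modules['odps'] = odps_pkg
sys.modules['odps.udf'] = udf_mod
sys.modules['odps.distcache'] = cache_mod

from odps.udf import annotate
from odps.distcache import get_cache_table


@annotate('->string')
class DistCacheTableExample(object):
    def __init__(self):
        self.records = list(get_cache_table('udf_test'))
        self.counter = 0
        self.ln = len(self.records)

    def evaluate(self):
        if self.counter > self.ln - 1:
            return None
        ret = self.records[self.counter]
        self.counter += 1
        return str(ret)


# Call evaluate() once more than there are records to see the None result.
udf = DistCacheTableExample()
results = [udf.evaluate() for _ in range(5)]
print(results)
```

Because the stub table holds four records, the first four calls return one record each and the fifth call returns None, which MaxCompute reports as NULL. The stubs replace any installed odps package for the current process only, so run this sketch in a separate script rather than alongside real PyODPS code.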

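2. Upload the resource and register the function

After you develop and debug the UDF code, upload the resource to MaxCompute and register the function in the MaxCompute client.

Run the following command to upload the Python script as a MaxCompute resource.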
```
add py table.py;
```
The following result is returned.
```
OK: Resource 'table.py' have been created.
```
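For more information about the command for adding resources, see Add resources.
Run the following command to register the Python UDF as a function.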
```
create function table_udf as 'table.DistCacheTableExample' using 'table.py,udf_test';
```
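In this command:
- table_udf is the name of the registered Python UDF, that is, the function name that you call later in SQL statements.
- In table.DistCacheTableExample, table is the name of the table.py script without its extension, and DistCacheTableExample is the class defined in table.py.
- The using clause lists the resources that the function depends on: the script table.py and the table resource udf_test.
The following result is returned.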
```
Success: Function 'table_udf' have been created.
```
For more information about registering functions, see Register functions.
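The class path passed to create function follows a module.ClassName convention. The short sketch below only illustrates that naming convention by splitting the path at its last dot. It is not how MaxCompute resolves the class internally, and split_class_path is a hypothetical helper.

```python
def split_class_path(class_path):
    """Split 'module.ClassName' into (module, class) at the last dot."""
    module_name, _, class_name = class_path.rpartition('.')
    if not module_name or not class_name:
        raise ValueError('expected a path of the form module.ClassName')
    return module_name, class_name


module_name, class_name = split_class_path('table.DistCacheTableExample')
# The module name corresponds to the script file name without ".py".
print(module_name + '.py', class_name)
```

If the script were saved under a different file name, the module part of the class path would have to change to match it.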

3. Usage example

After the UDF is registered, run the following commands to create test data and call the registered function.

-- Create a test table.
create table table_test (arg bigint);
-- Insert data.
insert into table_test values (1), (4), (15), (123), (7995);
-- Call the newly registered function in an SQL statement.
select table_udf() from table_test;

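The following result is returned. The function is called once for each of the five rows in table_test, but udf_test contains only four records, so the last call returns NULL.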

+----------+
| _c0      |
+----------+
| (4, 'c') |
| (5, 'd') |
| (1, 'a') |
| (2, 'b') |
| NULL     |
+----------+
