×
Community Blog MaxCompute Unleashed - Part 10: IF ELSE Branch Statement

MaxCompute Unleashed - Part 10: IF ELSE Branch Statement

Part 10 of the "Unleash the Power of MaxCompute" series introduces the script mode and parameterized views of MaxCompute.

By Haiqing

MaxCompute (formerly known as ODPS) is a leading distributed big data processing platform developed by Alibaba Cloud. It is widely utilized, especially within the Alibaba Group, and supports the core businesses of multiple business units (BUs). ODPS V2.0, continuously optimizing performance, aims to enhance the user experience and expression capabilities of SQL, as well as improve the productivity of ODPS developers.

Building upon the SQL engine of ODPS V2.0, MaxCompute simplifies SQL compilation and enhances the language's expressiveness. We present a series of articles titled Unleash the Power of MaxCompute to explore the capabilities of MaxCompute (ODPS V2.0).

The previous article introduced the script mode, which allows the script to be compiled and submitted as a whole. This generates an execution plan that enables queuing and execution, maximizing the utilization of MaxCompute computing resources. This process provides more optimization opportunities for developers and is more user-friendly for those accustomed to writing in common programming languages.

Furthermore, the script mode supports IF ELSE branch statements, enabling the program to automatically select the execution logic based on conditions. For example, the script can choose different logic based on even or odd dates, or adopt different strategies depending on the number of rows in a table. This feature effectively handles complex SQL statements in various business scenarios, providing SQL developers with more flexibility.

In this article, MaxCompute Studio is used as an example. For detailed information on installing and using MaxCompute Studio, please refer to the installation guides.

IF ELSE Branch Statement

Syntax

The IF syntax of Max Compute is as follows:
IF (condition) BEGIN
  statement 1
  statement 2
  ...
END


IF (condition) BEGIN
  statements
END ELSE IF (condition2) BEGIN
  statements
END ELSE BEGIN
  statements
END

Note:

• If BEGIN and END contain only one statement, you can omit the statement. (similar to '{ }' in Java).

• Multiple IF ELSE statements can be nested within each other.

• There are two types of conditions: expressions and scalar subqueries. Both of them are of the BOOLEAN type.

• The IF ELSE statement of a Boolean expression can be used to determine which branch to execute at compile time.

1

From the execution graph of MaxCompute Studio, you can see that only the src1 branch is executed in the final job.

2

• The IF ELSE statement of a scalar subquery cannot determine which branch to execute at compile time. The compiler does not know the data in the tb_1 table, so it cannot decide whether the condition is true or false, but only at runtime can it be done. Therefore, multiple jobs need to be submitted.

3

The following execution graph shows that two jobs are submitted for the MaxCompute Job.

4

The first calculation

(SELECT count(*) FROM tb_1 ) > 1

5

The second calculates the remaining part.

6

The following figure shows the execution result:

7

Examples

For complex SQL statements, there is often logic that includes IF ELSE. The example is as follows:

select  a.id 
,  greatest(b.c1, c.c1 ) as c1 
,  greatest(b.c2, c.c2 ) as c2
..
from (
select * from ta 
) a  
left outer  join (
select * from  (
     select  tx.id  , ty.c1 ,ty.c2
      (
     select *
     from foo ) tx
     join 
     (
     select *
     from  bar 
     ) ty  on tx.id2=ty.id2
) b  on a.id= b.id 
left  outer  join (
select * from tc
)c  on a.id=c.id ;

Due to business changes, data that can be associated with table b is relatively small. Table b is often empty, but occasionally there is a certain amount of traffic. The table bar has a large amount of data, which means that the cost of foo join bar is high. Therefore, it is expected that the script can implement such logic: when the traffic of table b is 0, change the SQL execution logic to associate only with tc. You can use the IF ELSE branch statement as follows:

@a := select * from ta;
@b :=SELECT  tx.id
        ,ty.c1
        ,ty.c2 ( SELECT * FROM foo ) tx
JOIN    (
            SELECT  *
            FROM    bar
        ) ty
ON      tx.id2 = ty.id2;
@c := select * from tc;

@select_expr table (id bigint ,c1 string,c2 string ...);

IF ( cast( (select count(*) as cnt  from @b ) as int)  == 0 ) BEGIN
@select_expr := select  a.id
,  c. c1
,  c.c2
...
from @a 
left outer  join
@c  on a.id=c.id;
END ELSE BEGIN
@select_expr := select  a.id
,  greatest(b.c1, c.c1 ) as c1
,  greatest(b.c2, c.c2 ) as c2
..
from @a a left outer  join @b  on a.id= b.id
left  outer  join @c  on a.id=c.id;

Summary

Leveraging the SQL engine of ODPS 3.0, MaxCompute introduces IF ELSE branch statements, enhancing the flexibility of programming for developers. The MaxCompute platform is dedicated to continuously enhancing the expressive capabilities of the SQL language.

Stay tuned.

0 1 0
Share on

Alibaba Cloud MaxCompute

137 posts | 20 followers

You may also like

Comments

Alibaba Cloud MaxCompute

137 posts | 20 followers

Related Products