Learning Performance Optimization Techniques from TPC-H Tests: Q18

I. Query Requirements

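Q18 retrieves the customers who have placed large orders, where a large order is one whose total line-item quantity exceeds a specified threshold. It is useful for identifying high-volume customers and their largest orders.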

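Q18 is characterized by a three-table join combined with grouping, sorting, aggregation, and an IN subquery. The query syntax itself does not limit the number of rows returned, but the TPC-H standard specifies that only the first 100 rows of the result are returned (this is usually left to the application to implement).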

 

II. Oracle Execution

The query written for Oracle is as follows:

select * from (
    select /*+ parallel(n) */
        c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
        sum(l_quantity)
    from
        customer, orders, lineitem
    where
        o_orderkey in (
            select
                l_orderkey
            from
                lineitem
            group by
                l_orderkey
            having
                sum(l_quantity) > 314
        )
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
    group by
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice
    order by
        o_totalprice desc,
        o_orderdate
) where rownum<=100;

Here /*+ parallel(n) */ is Oracle's parallel query hint, where n is the degree of parallelism.

Script execution time, in seconds:

Parallelism    1       2      4      8      12
Oracle         1248    739    533    412    344

 

III. SPL Optimization

Let us analyze this query. If we take the following subquery

    select l_orderkey, sum(l_quantity) lq
    from lineitem
    group by l_orderkey

and name it as the view lo, then the main body of the original query is equivalent to:

    select /*+ parallel(n) */
        c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
        sum(lq)
    from
        customer, orders, lo
    where
        c_custkey = o_custkey
        and o_orderkey = l_orderkey
        and lq > 314
    group by
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice
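The equivalence of the two forms can be checked on a small data set. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than Oracle, the sample rows are invented for illustration, the parallel hint and the rownum filter are left out, and lo is written as a common table expression instead of a stored view.

```python
import sqlite3

# Tiny in-memory database; all rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (c_custkey integer, c_name text);
    create table orders (o_orderkey integer, o_custkey integer,
                         o_orderdate text, o_totalprice real);
    create table lineitem (l_orderkey integer, l_quantity real);
    insert into customer values (1, 'Customer#1'), (2, 'Customer#2');
    insert into orders values
        (10, 1, '1995-01-01', 500.0),
        (11, 2, '1995-02-01', 900.0),
        (12, 1, '1995-03-01', 100.0);
    insert into lineitem values
        (10, 200), (10, 150),   -- order 10: total 350, above 314
        (11, 300),              -- order 11: total 300, not above 314
        (12, 315);              -- order 12: total 315, above 314
""")

# Original form: IN subquery plus a second pass over lineitem.
ORIGINAL = """
    select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
           sum(l_quantity)
    from customer, orders, lineitem
    where o_orderkey in (select l_orderkey from lineitem
                         group by l_orderkey
                         having sum(l_quantity) > 314)
      and c_custkey = o_custkey and o_orderkey = l_orderkey
    group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
    order by o_totalprice desc, o_orderdate
"""

# Rewritten form: lineitem is aggregated once into lo, then filtered.
REWRITTEN = """
    with lo as (select l_orderkey, sum(l_quantity) lq
                from lineitem group by l_orderkey)
    select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
           sum(lq)
    from customer, orders, lo
    where c_custkey = o_custkey and o_orderkey = l_orderkey and lq > 314
    group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
    order by o_totalprice desc, o_orderdate
"""

original_rows = conn.execute(ORIGINAL).fetchall()
rewritten_rows = conn.execute(REWRITTEN).fetchall()
print(original_rows == rewritten_rows)
```

The point of the rewrite is that lineitem, the largest table, only needs to be traversed and aggregated once.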

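This is a join between orders, a table that carries a foreign-key association, and its homo-dimension table lo (the two share the same primary key). We also know that lineitem is a sub-table of orders and is likewise sorted by l_orderkey, so lo, which is computed from lineitem, is still guaranteed to be ordered by l_orderkey. It can therefore be joined with orders by a fast merge join.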
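The idea of aggregating an ordered lineitem stream and then merging it with an ordered orders stream can be sketched as follows. This is a minimal illustration, not the SPL implementation: the function names large_orders and merge_join, the tuple layouts, and the sample rows are all assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def large_orders(lineitem_rows, threshold):
    """Yield (orderkey, total_quantity) for orders above the threshold.

    Assumes lineitem_rows is sorted by order key, so the rows of one
    order are adjacent and can be summed in a single pass.
    """
    for orderkey, rows in groupby(lineitem_rows, key=itemgetter(0)):
        total = sum(qty for _, qty in rows)
        if total > threshold:
            yield orderkey, total


def merge_join(lo_rows, orders_rows):
    """Join two streams that are both sorted by order key.

    Assumes the order key is unique in orders_rows (it is the primary key).
    """
    orders_iter = iter(orders_rows)
    order = next(orders_iter, None)
    for orderkey, total in lo_rows:
        # Advance orders until it catches up with the current lo key.
        while order is not None and order[0] < orderkey:
            order = next(orders_iter, None)
        if order is not None and order[0] == orderkey:
            yield order + (total,)


# Hypothetical sample data: (l_orderkey, l_quantity) and
# (o_orderkey, o_custkey, o_totalprice, o_orderdate), both sorted by key.
lineitem = [(10, 200), (10, 150), (11, 300), (12, 315)]
orders = [
    (10, 1, 500.0, "1995-01-01"),
    (11, 2, 900.0, "1995-02-01"),
    (12, 1, 100.0, "1995-03-01"),
]

joined = list(merge_join(large_orders(lineitem, 314), orders))
print(joined)
```

Because both inputs are already ordered, neither step needs a hash table or a sort, and each table is read only once.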

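customer, as the foreign-key table, only needs to be associated with orders on the final result set; it does not have to take part in the preceding operations.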
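Deferring the customer lookup until after the top 100 orders are known might look like the sketch below. The function name top_orders_with_customers, the dictionary layout, and the sample rows are hypothetical, and heapq.nsmallest stands in for whatever top-N mechanism the engine actually uses.

```python
import heapq


def top_orders_with_customers(joined_orders, customers, n=100):
    """Pick the top n orders first, then attach customer names.

    Ordering follows the query: total price descending, then order date
    ascending (ISO date strings compare correctly as plain strings).
    """
    top = heapq.nsmallest(
        n, joined_orders,
        key=lambda o: (-o["o_totalprice"], o["o_orderdate"]),
    )
    # Only customers referenced by the surviving orders are needed.
    needed = {o["o_custkey"] for o in top}
    names = {c["c_custkey"]: c["c_name"]
             for c in customers if c["c_custkey"] in needed}
    return [{"c_name": names[o["o_custkey"]], **o} for o in top]


# Hypothetical sample data for illustration only.
joined_orders = [
    {"o_orderkey": 10, "o_custkey": 1, "o_totalprice": 500.0,
     "o_orderdate": "1995-01-01", "quantities": 350},
    {"o_orderkey": 12, "o_custkey": 1, "o_totalprice": 100.0,
     "o_orderdate": "1995-03-01", "quantities": 315},
    {"o_orderkey": 13, "o_custkey": 3, "o_totalprice": 500.0,
     "o_orderdate": "1994-06-01", "quantities": 320},
]
customers = [
    {"c_custkey": 1, "c_name": "Customer#1"},
    {"c_custkey": 2, "c_name": "Customer#2"},
    {"c_custkey": 3, "c_name": "Customer#3"},
]

result = top_orders_with_customers(joined_orders, customers, n=2)
print([(r["c_name"], r["o_orderkey"]) for r in result])
```

With this ordering of steps, the customer table is filtered against at most n keys instead of being joined to every qualifying order.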

 

The SPL script is as follows:


      A
 1    =1
 2    =now()
 3    >quantity=314
 4    =file(path+"lineitem.ctx").create().cursor@m(L_ORDERKEY,L_QUANTITY;;A1)
 5    =A4.group(L_ORDERKEY;sum(L_QUANTITY):quantities).select(quantities>quantity).fetch()
 6    =file(path+"orders.ctx").create()
 7    =A5.joinx@q(L_ORDERKEY,A6:O_ORDERKEY,O_CUSTKEY,O_TOTALPRICE,O_ORDERDATE)
 8    =A7.groups@u(;top(100,[-O_TOTALPRICE,O_ORDERDATE],~):v).v
 9    =A8.id(O_CUSTKEY)
10    =file(path+"customer.ctx").create().cursor@m(C_CUSTKEY,C_NAME;A9.contain(C_CUSTKEY);A1)
11    =A10.fetch()
12    =A8.switch(O_CUSTKEY,A11:C_CUSTKEY)
13    =A12.new(O_CUSTKEY.C_NAME:c_name,O_CUSTKEY.C_CUSTKEY:c_custkey,L_ORDERKEY:o_orderkey,O_ORDERDATE,O_TOTALPRICE,quantities)
14    =now()
15    =interval@s(A2,A14)

 

Script execution time, in seconds:

Parallelism            1       2      4      8      12
Oracle                 1248    739    533    412    344
SPL composite table    152     76     38     21     16
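To make the timings easier to compare, the ratios can be worked out directly from the figures reported above. The short calculation below uses only those reported numbers; the variable names are chosen for this example.

```python
# Reported execution times in seconds, indexed by degree of parallelism.
parallelism = [1, 2, 4, 8, 12]
oracle = [1248, 739, 533, 412, 344]
spl = [152, 76, 38, 21, 16]

# How many times faster SPL is than Oracle at each degree of parallelism.
oracle_vs_spl = [round(o / s, 1) for o, s in zip(oracle, spl)]

# Speedup of each system relative to its own single-thread time.
oracle_scaling = [round(oracle[0] / o, 1) for o in oracle]
spl_scaling = [round(spl[0] / s, 1) for s in spl]

for p, ratio, o_scale, s_scale in zip(
        parallelism, oracle_vs_spl, oracle_scaling, spl_scaling):
    print(p, ratio, o_scale, s_scale)
```

On these figures the SPL script is roughly 8 times faster than Oracle with one thread and about 21 times faster with 12, and its speedup stays close to linear up to 4 threads.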