Learning Performance Optimization Techniques from TPCH Tests: Q7

I.     Query Requirements

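Q7 computes, for shipments made in 1995 and 1996, the total discounted revenue of line items in which parts were shipped from a supplier in one of two given nations to a customer in the other. The result lists the supplier nation, the customer nation, the year, and the revenue for that year, sorted in ascending order by supplier nation, customer nation, and year.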

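Q7 is a multi-table query that combines grouping, sorting, aggregation, and a subquery. The parent query of the subquery references no other query objects, so the subquery has a relatively simple form.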

 

II.     Oracle Execution

The SQL query written for Oracle is as follows:

select  /*+ parallel(n) */
    supp_nation,
    cust_nation,
    l_year,
    sum(volume) as revenue
from
    (
        select
            n1.n_name as supp_nation,
            n2.n_name as cust_nation,
            extract(year from l_shipdate) as l_year,
            l_extendedprice * (1 - l_discount) as volume
        from
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2
        where
            s_suppkey = l_suppkey
            and o_orderkey = l_orderkey
            and c_custkey = o_custkey
            and s_nationkey = n1.n_nationkey
            and c_nationkey = n2.n_nationkey
            and (
                (n1.n_name = 'CHINA' and n2.n_name = 'RUSSIA')
                or (n1.n_name = 'RUSSIA' and n2.n_name = 'CHINA')
            )
            and l_shipdate between date '1995-01-01' and date '1996-12-31'
    ) shipping
group by
    supp_nation,
    cust_nation,
    l_year
order by
    supp_nation,
    cust_nation,
    l_year;

Here /*+ parallel(n) */ is Oracle's parallel query hint, where n is the degree of parallelism.

Script execution time, in seconds:

Parallelism    1      2      4      8      12
Oracle         510    344    256    211    184

 

III.     SPL Optimization

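The operations in the inner subquery are similar to those in Q3, and so are the optimization principles, so they are not repeated here.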

 

The SPL script is as follows:


      A
1     =now()
2     1995-01-01
3     1996-12-31
4     >name1="CHINA"
5     >name2="RUSSIA"
6     =file("nation.btx").import@b().select(N_NAME==name1 || N_NAME==name2).derive@o().keys@i(N_NATIONKEY)
7     =file("supplier.ctx").open().cursor@m(S_SUPPKEY,S_NATIONKEY;S_NATIONKEY:A6).fetch().keys@im(S_SUPPKEY)
8     =file("customer.ctx").open().cursor@m(C_CUSTKEY,C_NATIONKEY;C_NATIONKEY:A6).fetch().keys@im(C_CUSTKEY)
9     =file("orders.ctx").open().cursor@m(O_ORDERKEY,O_CUSTKEY;O_CUSTKEY:A8)
10    =file("lineitem.ctx").open().news(A9,L_ORDERKEY,L_SUPPKEY,L_EXTENDEDPRICE,L_DISCOUNT,L_SHIPDATE,O_CUSTKEY;L_SHIPDATE>=A2 && L_SHIPDATE<=A3,L_SUPPKEY:A7)
11    =A10.select(O_CUSTKEY.C_NATIONKEY!=L_SUPPKEY.S_NATIONKEY)
12    =A11.groups(L_SUPPKEY.S_NATIONKEY.N_NAME:supp_nation,O_CUSTKEY.C_NATIONKEY.N_NAME:cust_nation,year(L_SHIPDATE):l_year;sum(L_EXTENDEDPRICE*(1-L_DISCOUNT)):volume)
13    return interval@ms(A1,now())

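Note that the nation data in A6 is used once in A7 and once in A8 to match and filter on the foreign key. This differs from the SQL, which joins the nation table twice under the aliases n1 and n2.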

 

Script execution time, in seconds:

Parallelism            1      2      4      8      12
Oracle                 510    344    256    211    184
SPL composite table    250    126    66     34     25
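The timings above can be turned into scaling figures with a few lines of arithmetic. The Python sketch below uses only the numbers reported in the table; the variable and function names are illustrative, and the ratios are simple quotients of the reported times, not additional measurements.

```python
# Reported execution times in seconds, indexed by degree of parallelism.
parallel = [1, 2, 4, 8, 12]
oracle = [510, 344, 256, 211, 184]
spl = [250, 126, 66, 34, 25]

def speedup(times):
    """Speedup of each run relative to the same engine's single-thread run."""
    return [round(times[0] / t, 2) for t in times]

oracle_scaling = speedup(oracle)
spl_scaling = speedup(spl)
# How many times faster SPL is than Oracle at the same degree of parallelism.
spl_vs_oracle = [round(o / s, 2) for o, s in zip(oracle, spl)]

for n, o, s, r in zip(parallel, oracle_scaling, spl_scaling, spl_vs_oracle):
    print(f"parallel={n:>2}  oracle x{o}  spl x{s}  spl/oracle x{r}")
```

On these figures, 12-way parallelism gives Oracle roughly a 2.77x gain over its own serial run, while the SPL composite table reaches 10x, and SPL is about 7.36 times faster than Oracle at that setting.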