SQL 如何将前导行的数据填入后继行的空值 *
有数据库表HEADER,数据如下所示:
NAME
ENG1
ENG2
ENG3
UPDATEBY
UPDATEDDATE
adobe
Alex
Bob
John
2020-09-29 13:10:00
adobe
Clint
John
2020-09-29 13:12:00
adobe
Mary
John
2020-09-29 13:14:00
adobe
Sheila
John
2020-09-29 14:10:00
ENG1, ENG 2, ENG 3,互相无关,可记为ENGX,有些值为空。现在按日期(UPDATEDDATE)排序后,如果当前记录的ENGX为空,则将上一条记录的 EngX 填充到本条,结果如下所示:
NAME
ENG1
ENG2
ENG3
UPDATEBY
UPDATEDDATE
adobe
Alex
Bob
John
2020-09-29 13:10:00
adobe
Clint
Alex
Bob
John
2020-09-29 13:12:00
adobe
Clint
Mary
Bob
John
2020-09-29 13:14:00
adobe
Sheila
Mary
Bob
John
2020-09-29 14:10:00
SQL Server的SQL:
SELECT H.*, MAX(ENG1) OVER (PARTITION BY NAME, GRP1 ) AS IMPUTED_ENG1
, MAX(ENG2) OVER (PARTITION BY NAME, GRP2 ) AS IMPUTED_ENG2
, MAX(ENG3) OVER (PARTITION BY NAME, GRP3 ) AS IMPUTED_ENG3
FROM (
SELECT H.*, COUNT(ENG1) OVER (PARTITION BY NAME ORDER BY UPDATEDDATE) AS GRP1
, COUNT(ENG2) OVER (PARTITION BY NAME ORDER BY UPDATEDDATE) AS GRP2
, COUNT(ENG3) OVER (PARTITION BY NAME ORDER BY UPDATEDDATE) AS GRP3
FROM HEADER H
) H;
这道题按自然思维很容易想出解决办法,数据按日期(UPDATEDDATE)排序后,对于那些ENGX列,如果当前值为空,将该列的上一个值赋予当前值即可。但是在SQL中的集合是无序的,需要借助窗口函数,给那些ENGX列分别做上标记,再利用标记填空,SELECT两次,非常麻烦。
用开源集算器的SPL就很容易写:
A |
|
1 |
=connect("mssql") |
2 |
=A1.query@x("SELECT * FROM HEADER ORDER BY UPDATEDDATE") |
3 |
>A2.run(ENG1=ifn(ENG1,ENG1[-1]), ENG2=ifn(ENG2,ENG2[-1]), ENG3=ifn(ENG3,ENG3[-1])) |
4 |
return A2 |
SPL直接支持有序集合,很容易实现相邻数据间的计算。
英文版