从 SQL 到 SPL:递归关系下的合计
某数据库有票据表和工时表,票据表存储了每张票据和父票据的关系,形成了自关联结构:
ticketid |
parentID |
1 |
6 |
2 |
7 |
3 |
8 |
4 |
9 |
5 |
10 |
6 |
18 |
7 |
19 |
8 |
20 |
9 |
21 |
10 |
22 |
11 |
23 |
12 |
18 |
13 |
19 |
14 |
20 |
15 |
21 |
16 |
22 |
17 |
23 |
18 |
24 |
19 |
25 |
20 |
26 |
21 |
27 |
22 |
28 |
23 |
29 |
24 |
30 |
25 |
30 |
26 |
30 |
27 |
30 |
28 |
30 |
29 |
30 |
30 |
0 |
工时表存储了每张票据对应的多个工时:
ticketid |
hours |
0 |
4 |
1 |
2 |
1 |
1 |
2 |
5 |
2 |
1 |
3 |
5 |
3 |
2 |
3 |
2 |
4 |
1 |
4 |
4 |
4 |
1 |
4 |
5 |
5 |
3 |
5 |
2 |
6 |
3 |
6 |
3 |
6 |
2 |
6 |
1 |
6 |
1 |
7 |
1 |
7 |
2 |
7 |
4 |
7 |
2 |
8 |
5 |
8 |
2 |
8 |
4 |
9 |
5 |
9 |
2 |
10 |
5 |
10 |
4 |
10 |
4 |
11 |
1 |
11 |
1 |
12 |
3 |
12 |
2 |
12 |
5 |
13 |
2 |
13 |
2 |
14 |
5 |
14 |
1 |
14 |
2 |
14 |
1 |
15 |
5 |
15 |
5 |
15 |
1 |
15 |
3 |
16 |
2 |
16 |
1 |
16 |
5 |
17 |
4 |
17 |
4 |
18 |
1 |
18 |
5 |
18 |
1 |
19 |
1 |
19 |
1 |
19 |
4 |
19 |
3 |
20 |
2 |
20 |
2 |
20 |
5 |
21 |
3 |
21 |
5 |
22 |
3 |
22 |
4 |
22 |
5 |
23 |
3 |
23 |
2 |
23 |
5 |
23 |
4 |
24 |
4 |
24 |
4 |
24 |
1 |
24 |
4 |
25 |
2 |
25 |
4 |
25 |
3 |
26 |
2 |
26 |
2 |
26 |
4 |
26 |
2 |
26 |
2 |
27 |
2 |
27 |
3 |
28 |
5 |
28 |
1 |
28 |
3 |
28 |
1 |
29 |
1 |
29 |
4 |
30 |
0 |
30 |
0 |
现在要计算出每张票据的工时,并递归地计算出该票据和所有下级子票据的工时之和,即总工时。
MainID |
Direct_hours |
Total_hours |
1 |
3 |
3 |
2 |
6 |
6 |
3 |
9 |
9 |
4 |
11 |
11 |
5 |
5 |
5 |
6 |
10 |
13 |
7 |
9 |
15 |
8 |
11 |
20 |
9 |
7 |
18 |
10 |
13 |
18 |
11 |
2 |
2 |
12 |
10 |
10 |
13 |
4 |
4 |
14 |
9 |
9 |
15 |
14 |
14 |
16 |
8 |
8 |
17 |
8 |
8 |
18 |
7 |
30 |
19 |
9 |
28 |
20 |
9 |
38 |
21 |
8 |
40 |
22 |
12 |
38 |
23 |
14 |
24 |
24 |
13 |
43 |
25 |
9 |
37 |
26 |
12 |
50 |
27 |
5 |
45 |
28 |
10 |
48 |
29 |
5 |
29 |
30 |
0 |
252 |
SQL:
WITH CTE_TREE AS (
SELECT parentid AS parentid, ticketid AS children
FROM tickets t
WHERE parentID <> 0
UNION
SELECT parentid, NULL
FROM tickets
WHERE parentID <> 0
UNION
SELECT ticketid, NULL
FROM tickets
)
, CTE_TRAVERSE AS (
SELECT parentid AS mainId, children AS nextParent
FROM CTE_TREE
UNION ALL
SELECT t.mainId, tree.children
FROM CTE_TREE tree
INNER JOIN CTE_TRAVERSE t
ON t.nextParent = tree.parentid
WHERE tree.children <> ''
)
SELECT t.MainID
, SUM(CASE WHEN t.nextparent IS NULL THEN h.Hours END) AS Direct_hours
, SUM(h.Hours) AS Total_hours
FROM CTE_TRAVERSE t
INNER JOIN Hours h
ON h.ticketid = t.nextparent
OR (h.ticketid = t.mainID AND t.nextparent IS NULL)
GROUP BY t.mainId
SQL 要用多个子查询实现自关联和递归关系,代码复杂难懂。SPL 直接提供了引用函数可建立自关联,提供了递归函数可取所有下级节点:https://try.esproc.com/splx?2SJ
A |
|
1 |
$select t.ticketid ticketid,t.parentID parentID,h.hours hours from tickets.txt t left join (select ticketid,sum(hours) hours from hours.txt group by ticketid) h on t.ticketid=h.ticketid |
2 |
>A1.switch(parentID,A1:ticketid) |
3 |
=A1.new(ticketid:MainID,hours:Direct_hours,hours+A1.nodes(parentID,~).sum(hours):Total_hours) |
A1:加载数据,在票据表的基础上,计算出每张票据的直接工时。
A2:让每条记录的父票据字段指向父票据记录,建立自关联关系。函数 switch 可以将字段值修改成记录。
A3:新建二维表,当前票据的总工时 = 当前票据的直接工时 + 当前票据所有的下级子票据的工时之和。函数 nodes 可以递归地计算出某条记录所有的下级记录。
英文版 https://c.esproc.com/article/1743072855312