从 SQL 到 SPL:从两个表中找出每条记录日期的最接近的匹配项

SQL Server 有两个表,Table1:

ID

JoiningDt

DocNum

1

2024-04-05

A123

2

2024-04-06

A123

3

2024-04-04

B123

Table2:

DocNum

ClosestDt

A123

03-Apr-2024

A123

04-Apr-2024

A123

07-Apr-2024

A123

08-Apr-2024

B123

02-Apr-2024

B123

05-Apr-2024

现在要将 Table1 按 ID 排序,依次遍历每条记录:取出 Table2 中与当前记录的 DocNum 字段相同,但时间稍晚的那一条记录(所有更晚的记录中最早的)。特别的规则在于,从 Table2 中取过的记录下次不能再取。

ID

JoiningDt

DocNum

ClosestDt

1

2024-04-05

A123

2024-04-07

2

2024-04-06

A123

2024-04-08

3

2024-04-04

B123

2024-04-05

SQL:

WITH CTE1 As (
SELECT t1.ID, t1.JoiningDt, t1.DocNum,
  (SELECT TOP 1 ClosestDt FROM Table2 
  WHERE DocNum = t1.DocNum AND ClosestDt > t1.JoiningDt ORDER BY ClosestDt  ASC   ) ClosestDt
FROM Table1 t1
 ), CTE2 AS (
  SELECT
       ID, JoiningDt, DocNum, ClosestDt
    , ROW_NUMBER() OVER(PARTITION BY DocNum,   ClosestDt ORDER BY ID) rn
  FROM CTE1
  )
SELECT ID, JoiningDt, DocNum, 
  CASE WHEN rn = 1 then ClosestDt ELSE
  (SELECT ClosestDt FROM Table2 
  WHERE DocNum = c1.DocNum AND ClosestDt > c1.JoiningDt ORDER BY ClosestDt  ASC   
  OFFSET c1.rn -1  ROWS FETCH NEXT 1 ROWS ONLY) END as ClosestDt
  FROM CTE2 c1

这里要进行有序计算,特别是要实现取过不能再取的规则,SQL 要造序号和标志位,还要用多层嵌套配合 join 语句来变相实现,代码繁琐难懂;改用存储过程会直观些,但代码更长,结构变复杂。SPL 可以直接按业务逻辑实现。https://try.esproc.com/splx?2HC


 A

1

$select * from Table1.txt order by ID

2

$select * from Table2.txt order by ClosestDt

3

=A1.derive(A2.delete@n([A2.select@1(DocNum==A1.DocNum && ClosestDt>A1.JoiningDt)]).ClosestDt)

A1-A2:加载数据。

A3:循环处理 A1 的每条记录,追加新字段,业务逻辑是:从 A2 取 DocNum 与当前记录相同但更晚的那些记录,从中取第 1 条的 ClosestDt,再从 A2 中删掉这条记录以避免再取。函数 select@1 表示取符合条件的第 1 条记录。

问题来源:https://stackoverflow.com/questions/78206862/find-the-distinct-closest-date-match-for-each-record-from-2-tables