从 SQL 到 SPL:按时间区间汇总

某库表存储时序数据,每条记录间隔若干秒。

dt

payload

2024-01-01 12:00:13

0

2024-01-01 12:00:28

9

2024-01-01 12:00:39

2

2024-01-01 12:00:44

3

2024-01-01 12:00:53

0

2024-01-01 12:01:00

6

2024-01-01 12:01:19

0

2024-01-01 12:01:26

4

2024-01-01 12:01:27

2

2024-01-01 12:01:49

6

2024-01-01 12:02:09

5

2024-01-01 12:02:32

6

2024-01-01 12:02:59

7

2024-01-01 12:03:18

5

2024-01-01 12:03:25

4

2024-01-01 12:03:28

5

2024-01-01 12:03:30

3

2024-01-01 12:03:41

1

2024-01-01 12:04:23

9

2024-01-01 12:04:32

7

2024-01-01 12:04:34

5

2024-01-01 12:04:43

4

2024-01-01 12:04:56

7

2024-01-01 12:05:18

7

2024-01-01 12:05:33

3

2024-01-01 12:05:39

5

2024-01-01 12:05:54

1

2024-01-01 12:05:56

8

2024-01-01 12:06:11

7

2024-01-01 12:06:40

2

2024-01-01 12:06:47

5

2024-01-01 12:06:51

6

2024-01-01 12:06:58

5

2024-01-01 12:07:06

3

2024-01-01 12:07:10

5

2024-01-01 12:07:27

0

2024-01-01 12:07:33

1

2024-01-01 12:07:38

3

2024-01-01 12:08:04

2

2024-01-01 12:08:11

4

2024-01-01 12:08:21

1

2024-01-01 12:08:44

1

2024-01-01 12:08:55

3

2024-01-01 12:09:01

4

2024-01-01 12:09:05

0

2024-01-01 12:09:12

2

2024-01-01 12:09:28

8

2024-01-01 12:09:38

5

2024-01-01 12:10:14

3

2024-01-01 12:10:25

2

2024-01-01 12:10:30

7

2024-01-01 12:10:39

1

2024-01-01 12:10:53

6

2024-01-01 12:11:15

5

2024-01-01 12:11:25

6

2024-01-01 12:11:41

1

2024-01-01 12:11:53

8

2024-01-01 12:11:59

5

2024-01-01 12:12:03

9

2024-01-01 12:12:09

5

2024-01-01 12:12:28

8

2024-01-01 12:12:34

2

2024-01-01 12:12:53

1

2024-01-01 12:13:13

3

2024-01-01 12:13:14

8

2024-01-01 12:13:19

8

2024-01-01 12:13:48

3

2024-01-01 12:14:06

5

2024-01-01 12:14:18

1

2024-01-01 12:14:36

3

2024-01-01 12:14:40

4

2024-01-01 12:14:55

3

现在要每隔一分钟做一次分组汇总,每次汇总 5 分钟的数据。比如第 1、2、3 秒生成 3 条记录,分别对 1-5、2-6、3-7 分钟的数据做汇总。

From

To

payload

2024-01-01 12:00:00

2024-01-01 12:05:00

100

2024-01-01 12:01:00

2024-01-01 12:06:00

110

2024-01-01 12:02:00

2024-01-01 12:07:00

117

2024-01-01 12:03:00

2024-01-01 12:08:00

111

2024-01-01 12:04:00

2024-01-01 12:09:00

104

2024-01-01 12:05:00

2024-01-01 12:10:00

91

2024-01-01 12:06:00

2024-01-01 12:11:00

86

2024-01-01 12:07:00

2024-01-01 12:12:00

86

2024-01-01 12:08:00

2024-01-01 12:13:00

99

2024-01-01 12:09:00

2024-01-01 12:14:00

110

2024-01-01 12:10:00

2024-01-01 12:15:00

107

SQL

SELECT
    [From], DATEADD(MINUTE, 1, [To]) [To], payload
FROM (
    SELECT
        dt, MIN(dt) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) [From],
        dt [To], SUM(payload) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) payload
    FROM (
        SELECT
            DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0) dt, 
SUM(payload) payload
        FROM #tmstmp
        GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0)
    ) q
) q
WHERE DATEDIFF(MINUTE, [From], [To]) > 3

SQL要用嵌套子查询 + 多个窗口函数实现,代码繁琐。SPL 提供了直接访问位置的语法 https://try.esproc.com/splx?4ql


 A

1

$select * from tmstmp.txt

2

=A1.groups(datetime@m(dt):dt;sum(payload):payload)

3

=A2.new(dt:From, elapse@s(From,300):To, payload[0:4].sum(): payload)

4

=A3.m(:-5)

A1:加载数据。

A2:按分钟分组汇总。

A3:生成新二维表,To 取自当前记录,From 取自当前记录后的 5 分钟,payload 汇总自当前记录到第 4 条记录的区间。

A4:从第 1 条取到倒数第 5 条。

问题来源:https://stackoverflow.com/questions/78152707/get-sum-for-each-5-minute-time-interval