组内再分组汇总并取前 N 名后合并
【问题】
Stuck on Sql Query with time interval
I want to get best 3 day of users between “2014-07-01” and “2014-08-01”
Could someone help me? I’ve been stuck here for 3 days.
In real score table entries are 10:00 to 22:00 and 1 entries for each hour.
Total of 12 entry for each day and each player (sometimes it could be less 1 or 2).
This is the output I’m trying to get:
ID | User_ID | Username | Sum(Score) | Date
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
1 | 1 | Xxx | 52| 2014-07-01
2 | 1 | Xxx | 143| 2014-07-02
3 | 2 | Yyy | 63| 2014-07-01
...
Scoretable:
ID | User_ID | Score | Datetime
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
1 | 1| 35| 2014-07-0111:00:00
2 | 1| 17| 2014-07-0112:00:00
3 | 2| 36| 2014-07-0111:00:00
4 | 2| 27| 2014-07-0112:00:00
5 | 1| 66| 2014-07-0211:00:00
6 | 1| 77| 2014-07-0212:00:00
7 | 2| 93| 2014-07-0212:00:00
...
Usertable :
ID | Username
\-\-\-\-\-\-\-\-\-\-\-\-\-\-
1 | Xxx
2 | Yyy
3 | Zzz
...
别人的回答:
I think you need to aggregate first by date, and then choose the first three usingrow_number(). To do the aggregation:
select s.user_id, sum(s.datetime,'day')as theday, sum(score)as score,
row_number()over(partitionby s.user_id orderby sum(score)desc)as seqnum
from scores s
groupby s.user_id;
To get the rest of the information, use this as a subquery or CTE:
select u.*, s.score
from(select s.user_id, sum(s.datetime,'day')as theday, sum(s.score)as score,
row_number()over(partitionby s.user_id orderby sum(s.score)desc)as seqnum
from scores s
groupby s.user_id
)s join
users u
on s.user_id = u.users_id
whereseqnum <=3
orderby u.user_id, s.score desc;
【回答】
这是个比较典型的组内计算,解决思路很清晰:
1. 将数据按 User_ID 分成多个组,每个组是一个用户的全部数据。
2. 组内运算,将每组数据按日期再分组,并汇总出每日的总分。
3. 组内运算,在每组数据中求得总分前三名的记录。
4. 将所有数据合并。
上述思路虽然清晰,但用 SQL 却很难表达组内运算,所以你“Stuck on Sql query”。这种情况建议采用 SPL 来辅助解决,SPL 可以方便地表达组内运算,代码如下:
A |
|
1 |
=dataSource.query("select s.ID ID,s.User_ID User_ID,u.Username Username,s.Score Score,s.Datetime Datetime from Scoretable s join Usertable u on s.User_ID = u.ID") |
2 |
=A1.group(User_ID) |
3 |
=A2.(~.groups(User_ID,Username,date(Datetime):day; sum(Score):sumScore)) |
4 |
=A3.(~.top(-3;sumScore)) |
5 |
=A4.conj() |
1. 上面的“~”代表的是“每组数据”。
2.A2 中的的分组不用聚合,所以用 group 函数。
3.A3 中的分组需要聚合,所以用 groups 函数。
4. 上述代码和 JAVA 或报表很容易集成,
更多内容请参看:【集算器简化 SQL 式计算之组内运算】