Row-to-column transposition with dynamic columns and enumerated grouping

【Question】

Hi All,

Thanks for the wonderful support the community gets from this forum.

I am trying to accomplish this in MongoDB. I didn't think it could get this complicated, and I thought the problem was interesting to solve.

I am trying to get a count of students by scores for various subjects.

The example below shows 2 subjects; in reality we could run this ad-hoc query for 1 or more subjects (so the results cannot be "canned" and the query should run in real time).

(E.g. run the query for a bunch of schools and a bunch of subjects.)

The grades are always 1-5, with no decimal point.

use students
db.studentsummary.insert({school: 'atl1', sname: 'Sean', sub1: 4, sub2: 5})
db.studentsummary.insert({school: 'atl1', sname: 'chris', sub1: 4, sub2: 3})
db.studentsummary.insert({school: 'atl1', sname: 'becky', sub1: 5, sub2: 4})
db.studentsummary.insert({school: 'atl1', sname: 'sam', sub1: 5, sub2: 4})
db.studentsummary.insert({school: 'atl2', sname: 'dustin', sub1: 2, sub2: 2})
db.studentsummary.insert({school: 'atl2', sname: 'greg', sub1: 3, sub2: 4})
db.studentsummary.insert({school: 'atl2', sname: 'peter', sub1: 5, sub2: 1})
db.studentsummary.insert({school: 'atl2', sname: 'brad', sub1: 2, sub2: 2})
db.studentsummary.insert({school: 'atl2', sname: 'liz', sub1: 3, sub2: null})

Desired output (I would like to see how close we could get to it): show how many got a 5, how many got a 4, and so on…

I tried quite a bit: grouping by each subject, running different pipelines based on the subjects chosen for the query, and letting the front end manage the merge and pivot, but performance was unacceptable.

Any help will be highly appreciated.

【Answer】

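MongoDB has no direct support for row-to-column transposition, and enumerated grouping is also awkward to implement in it: the data has to be read out and processed in a programming language such as Java or PHP. The dynamic columns make this kind of set operation harder still. The requirement is easier to meet with SPL. The walkthrough below does not depend on MongoDB for its presentation; when the data lives in MongoDB, SPL's mongo_open() function connects to the database and mongo_shell() queries the raw data: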

    A                   B           C
1   =mongo_open("mongodb://localhost:27017/local?user=test&password=test")
2   =mongo_shell(A1,"student.find()")
3   =[5,4,3,2,1]        >subs=["sub1","sub2"]
4   =A2.group(school)   >newfields=[]
5   for subs            >newfields=newfields|A3.(A5+"_"+string(~))
6   >result=create(${"school,"+newfields.concat@c()})
7   for A4              >temp=[]
8                       for subs    >temp=temp|A7.align@a(A3,${B8}).(~.len())
9                       >temp=[A7.school]|temp
10                      >result.record(temp)
11  >mongo_close(A1)

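A1, A2: query the raw data into a table sequence in A2.

A3, B3: prepare the dynamic grade list (A3) and the subject column names (B3).

A4: group the raw data by school.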

A5, B5, A6: create the result table, whose structure is school, sub1_5, sub1_4 … sub1_1, sub2_5, sub2_4 … sub2_1.
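
For readers less familiar with SPL, the column-name step in A5, B5 and A6 can be sketched in Python. This is only an illustration of the same idea and not part of the SPL script; the function name build_columns is made up for this example.

```python
def build_columns(subjects, grades):
    """Return the result-table header: school, then one column per subject/grade pair."""
    columns = ["school"]
    for subject in subjects:          # plays the role of the loop over subs in A5
        for grade in grades:          # plays the role of A3.(A5+"_"+string(~)) in B5
            columns.append(f"{subject}_{grade}")
    return columns


header = build_columns(["sub1", "sub2"], [5, 4, 3, 2, 1])
print(header)
```

Because the header is derived from the two lists, adding a subject or changing the grade range changes the columns without touching the rest of the logic.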

A7: loop over the schools; each iteration prepares one record to be inserted into result (B10).

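B8, C8: loop over subs, counting for a single school from A4 how many students got each grade in each subject, and concatenating the counts in order into one sequence. For the count, the current school's records (A7) are align-grouped on the current subject (B8) against the grade list in A3.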
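
As a rough Python analogue of the counting in C8, the sketch below tallies one school's students per grade, in the order given by the grade list. It assumes that grades not present in the list (such as the null in liz's sub2) are simply left out of the counts, which is how the aligned grouping is used here. The function name count_by_grade is made up for this example.

```python
from collections import Counter


def count_by_grade(students, subject, grades):
    """Count students per grade for one subject, in the order given by `grades`."""
    tally = Counter(student.get(subject) for student in students)
    # Grades with no students yield 0; values not listed in `grades` (e.g. None) are ignored.
    return [tally.get(grade, 0) for grade in grades]


# The atl2 records from the question.
atl2 = [
    {"school": "atl2", "sname": "dustin", "sub1": 2, "sub2": 2},
    {"school": "atl2", "sname": "greg", "sub1": 3, "sub2": 4},
    {"school": "atl2", "sname": "peter", "sub1": 5, "sub2": 1},
    {"school": "atl2", "sname": "brad", "sub1": 2, "sub2": 2},
    {"school": "atl2", "sname": "liz", "sub1": 3, "sub2": None},
]

grades = [5, 4, 3, 2, 1]
sub1_counts = count_by_grade(atl2, "sub1", grades)
sub2_counts = count_by_grade(atl2, "sub2", grades)
print(sub1_counts)  # [1, 0, 2, 2, 0]
print(sub2_counts)  # [0, 1, 0, 2, 1]
```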

B9: prepend the school name to that sequence, so that the values line up with the column order of the result table sequence.

B10: insert the record into the result table.
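
Putting the steps together, the whole dynamic pivot can be sketched in plain Python over the sample data from the question. This is a minimal illustration of the logic and not a replacement for the SPL script; the function name pivot_counts is made up here, and the data is held in memory instead of being read from MongoDB.

```python
from collections import Counter, defaultdict


def pivot_counts(rows, subjects, grades):
    """Return (header, records): one record per school with a count per subject/grade."""
    by_school = defaultdict(list)
    for row in rows:                                  # group by school (A4)
        by_school[row["school"]].append(row)

    header = ["school"] + [f"{s}_{g}" for s in subjects for g in grades]

    records = []
    for school, students in by_school.items():        # loop over schools (A7)
        values = [school]                             # school name first (B9)
        for subject in subjects:                      # loop over subjects (B8)
            tally = Counter(st.get(subject) for st in students)
            values += [tally.get(g, 0) for g in grades]
        records.append(dict(zip(header, values)))     # append to the result (B10)
    return header, records


data = [
    {"school": "atl1", "sname": "Sean", "sub1": 4, "sub2": 5},
    {"school": "atl1", "sname": "chris", "sub1": 4, "sub2": 3},
    {"school": "atl1", "sname": "becky", "sub1": 5, "sub2": 4},
    {"school": "atl1", "sname": "sam", "sub1": 5, "sub2": 4},
    {"school": "atl2", "sname": "dustin", "sub1": 2, "sub2": 2},
    {"school": "atl2", "sname": "greg", "sub1": 3, "sub2": 4},
    {"school": "atl2", "sname": "peter", "sub1": 5, "sub2": 1},
    {"school": "atl2", "sname": "brad", "sub1": 2, "sub2": 2},
    {"school": "atl2", "sname": "liz", "sub1": 3, "sub2": None},
]

header, records = pivot_counts(data, ["sub1", "sub2"], [5, 4, 3, 2, 1])
for record in records:
    print([record[name] for name in header])
# ['atl1', 2, 2, 0, 0, 0, 1, 2, 1, 0, 0]
# ['atl2', 1, 0, 2, 2, 0, 0, 1, 0, 2, 1]
```

The subjects and grades are ordinary arguments, so an ad-hoc query for a different set of subjects only changes the call, not the function.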

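If the columns are not dynamic and there are only a few subjects, the script can be simplified a little; the logic is much the same as in the method above: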

    A
1   =mongo_open("mongodb://localhost:27017/local?user=test&password=test")
2   =mongo_shell(A1,"student.find()")
3   =A2.group(school)
4   =[5,4,3,2,1]
5   =A3.new(school,~.align@a(A4,sub1).(~.len()):sub1,~.align@a(A4,sub2).(~.len()):sub2)
6   =A5.new(school,~.sub1(1):sub1_5,~.sub1(2):sub1_4,~.sub1(3):sub1_3,~.sub1(4):sub1_2,~.sub1(5):sub1_1,~.sub2(1):sub2_5,~.sub2(2):sub2_4,~.sub2(3):sub2_3,~.sub2(4):sub2_2,~.sub2(5):sub2_1)
7   >mongo_close(A1)
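
The two-step shape of this simplified version (first a list of counts per subject as in A5, then one named column per list position as in A6) can be sketched in Python as follows. It is only an illustration with made-up function names, using the atl1 records from the question; note that SPL positions start at 1 while Python indexes start at 0.

```python
GRADES = [5, 4, 3, 2, 1]  # same order as A4


def count_grades(students, subject):
    """Counts per grade in GRADES order, the role played by align@a(...).(~.len())."""
    return [sum(1 for s in students if s.get(subject) == g) for g in GRADES]


def fixed_pivot(groups):
    """groups maps a school name to its list of student records."""
    rows = []
    for school, students in groups.items():
        sub1 = count_grades(students, "sub1")   # step A5: one list per subject
        sub2 = count_grades(students, "sub2")
        rows.append({                           # step A6: spell out every column
            "school": school,
            "sub1_5": sub1[0], "sub1_4": sub1[1], "sub1_3": sub1[2],
            "sub1_2": sub1[3], "sub1_1": sub1[4],
            "sub2_5": sub2[0], "sub2_4": sub2[1], "sub2_3": sub2[2],
            "sub2_2": sub2[3], "sub2_1": sub2[4],
        })
    return rows


groups = {
    "atl1": [
        {"sname": "Sean", "sub1": 4, "sub2": 5},
        {"sname": "chris", "sub1": 4, "sub2": 3},
        {"sname": "becky", "sub1": 5, "sub2": 4},
        {"sname": "sam", "sub1": 5, "sub2": 4},
    ],
}

rows = fixed_pivot(groups)
print(rows[0])
```

Writing the columns out by hand is shorter for two subjects, but each new subject or grade means editing the column list, which is the trade-off against the dynamic version.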