SPL 协助 Mongodb:Aggregation nested $group
Mongodb 数据库中某 collection(名为 type_properties) 数据如下:
[
{ type: "agency", properties: 4 },
{ type: "host", properties: 5 },
{ type: "agency", properties: 4 },
{ type: "landlord", properties: 5 },
{ type: "agency", properties: 8 },
{ type: "tenant", properties: 2 },
{ type: "host", properties: 1 },
{ type: "host", properties: 1 },
{ type: "agency", properties: 8 },
{ type: "host", properties: 9 },
{ type: "host", properties: 5 },
{ type: "agency", properties: 1 },
{ type: "agency", properties: 2 },
{ type: "tenant", properties: 2 },
{ type: "agency", properties: 1 },
{ type: "tenant", properties: 4 },
{ type: "tenant", properties: 7 }
]
要求:按 type 分组,组内统计 5 个指标,如下
count:组内记录数,以 tenant 为例是 4
“0 properties”: 组内 PROPERTIES=0 的记录数,本例是 0
“1 property”: 组内 PROPERTIES=1 的记录数,本例是 0
“2 properties”: 组内 PROPERTIES=2 的记录数,本例是 2
“3 or more properties”: 组内 PROPERTIES>=3 的记录数,本例是 2
期望结果如下:
[
{
type: "tenant",
count: 4, // THERE ARE 4 TENANTS TOTAL
"0 properties": 0, // 0 TENANTS WHERE PROPERTIES = 0
"1 property": 0, // 0 TENANTS WHERE PROPERTIES = 1
"2 properties": 2, // 2 TENANTS WHERE PROPERTIES = 2
"3 or more properties": 2 // 2 TENANTS WHERE PROPERTIES >= 3
},
{
type: "landlord",
count: 1,
"0 properties": 0,
"1 property": 0,
"2 properties": 0,
"3 or more properties": 1
},
{
type: "agency",
count: 7,
"0 properties": 0,
"1 property": 2,
"2 properties": 1,
"3 or more properties": 4
},
{
type: "host",
count: 5,
"0 properties": 0,
"1 property": 2,
"2 properties": 0,
"3 or more properties": 3
}
]
MongoDB query 冗长查询语句比较繁琐,一种较简洁的写法如下:
db.type_properties.aggregate([
{
$group: {
_id: "$type",
count: { $count: {} },
num_properties: { $push: "$properties" }
}
},
{
$set: {
"0 properties": {
$size: {
$filter: { input: "$num_properties", cond: { $eq: ["$$this", 0] } }
}
},
"1 property": {
$size: {
$filter: { input: "$num_properties", cond: { $eq: ["$$this", 1] } }
}
},
"2 properties": {
$size: {
$filter: { input: "$num_properties", cond: { $eq: ["$$this", 2] } }
}
},
"3 or more properties": {
$size: {
$filter: { input: "$num_properties", cond: { $gte: ["$$this", 3] } }
}
},
type: "$_id",
_id: "$$REMOVE",
num_properties: "$$REMOVE"
}
}
])
SPL 提供的 count 函数可以使用过滤条件为参数,因此可以一句分组代码实现:
A |
|
1 |
=mongo_open@d("mongodb://127.0.0.1:27017/local") |
2 |
=mongo_shell@d(A1, "{'find':'type_properties','projection':{'_id':0}}") |
3 |
=mongo_close(A1) |
4 |
=A2.groups(type;count(1):count,count(properties==0):'0 properties', count(properties==1):'1 properties', count(properties==2):'2 properties', count(properties>=3):'3 or more properties') |
5 |
=json(A4) |
前三行是读数,计算代码只有 A4 一行,A5 将结果转回 json 格式,其输出结果和原文要求一模一样。
英文版:https://c.esproc.com/article/1741439730467