SPL 协助 Mongodb:Aggregation nested $group

Mongodb 数据库中某 collection(名为 type_properties) 数据如下:

[
    { type: "agency", properties: 4 },
    { type: "host", properties: 5 },
    { type: "agency", properties: 4 },
    { type: "landlord", properties: 5 },
    { type: "agency", properties: 8 },
    { type: "tenant", properties: 2 },
    { type: "host", properties: 1 },
    { type: "host", properties: 1 },
    { type: "agency", properties: 8 },
    { type: "host", properties: 9 },
    { type: "host", properties: 5 },
    { type: "agency", properties: 1 },
    { type: "agency", properties: 2 },
    { type: "tenant", properties: 2 },
    { type: "agency", properties: 1 },
    { type: "tenant", properties: 4 },
    { type: "tenant", properties: 7 }
]

要求:按 type 分组,组内统计 5 个指标,如下

count:组内记录数,以 tenant 为例是 4
“0 properties”: 组内 PROPERTIES=0 的记录数,本例是 0
“1 property”: 组内 PROPERTIES=1 的记录数,本例是 0
“2 properties”: 组内 PROPERTIES=2 的记录数,本例是 2
“3 or more properties”: 组内 PROPERTIES>=3 的记录数,本例是 2

期望结果如下:

[
    {
        type: "tenant",
        count: 4, 		// THERE ARE 4 TENANTS TOTAL
        "0 properties": 0, 	// 0 TENANTS WHERE PROPERTIES = 0
        "1 property": 0, 	// 0 TENANTS WHERE PROPERTIES = 1
        "2 properties": 2, 	// 2 TENANTS WHERE PROPERTIES = 2
        "3 or more properties": 2 	// 2 TENANTS WHERE PROPERTIES >= 3
    },
    {
        type: "landlord",
        count: 1, 
        "0 properties": 0,
        "1 property": 0,
        "2 properties": 0,
        "3 or more properties": 1
    },
    {
        type: "agency",
        count: 7, 
        "0 properties": 0,
        "1 property": 2,
        "2 properties": 1,
        "3 or more properties": 4
    },
    {
        type: "host",
        count: 5, 
        "0 properties": 0,
        "1 property": 2,
        "2 properties": 0,
        "3 or more properties": 3
    }
]

MongoDB query 冗长查询语句比较繁琐,一种较简洁的写法如下:

db.type_properties.aggregate([
  {
    $group: {
      _id: "$type",
      count: { $count: {} },
      num_properties: { $push: "$properties" }
    }
  },
  {
    $set: {
      "0 properties": {
        $size: {
          $filter: { input: "$num_properties", cond: { $eq: ["$$this", 0] } }
        }
      },
      "1 property": {
        $size: {
          $filter: { input: "$num_properties", cond: { $eq: ["$$this", 1] } }
        }
      },
      "2 properties": {
        $size: {
          $filter: { input: "$num_properties", cond: { $eq: ["$$this", 2] } }
        }
      },
      "3 or more properties": {
        $size: {
          $filter: { input: "$num_properties", cond: { $gte: ["$$this", 3] } }
        }
      },
      type: "$_id",
      _id: "$$REMOVE",
      num_properties: "$$REMOVE"
    }
  }
])

SPL 提供的 count 函数可以使用过滤条件为参数,因此可以一句分组代码实现:



A

1

=mongo_open@d("mongodb://127.0.0.1:27017/local")

2

=mongo_shell@d(A1, "{'find':'type_properties','projection':{'_id':0}}")

3

=mongo_close(A1)

4

=A2.groups(type;count(1):count,count(properties==0):'0 properties', count(properties==1):'1 properties', count(properties==2):'2 properties', count(properties>=3):'3 or more properties')

5

=json(A4)

前三行是读数,计算代码只有 A4 一行,A5 将结果转回 json 格式,其输出结果和原文要求一模一样。


问题来源:https://stackoverflow.com/questions/78478388/mongodb-aggregation-nested-group