MongoDB 如何分组后获取每组的最新记录
在统计应用项目中,经常遇到分组后,想了解每组中的日期最新记录的状态。
如有集合 project, 想了解executionProject每组日期最新的 suiteStatus,数据如下:
{ "_id" : ObjectId("55d4410544c96d6f6578f893"), "executionProject" : "Project1", "suiteList" : [ { "suiteStatus" : "PASS", } ], "runEndTime" : ISODate("2015-08-19T08:40:47.049Z"), "runStartTime" : ISODate("2015-08-19T08:40:37.621Z"), "runStatus" : "PASS", "__v" : 1 }, { "_id" : ObjectId("55d44eb4c0422e7b8bffe76b"), "executionProject" : "Project1", "suiteList" : [ { "suiteStatus" : "PASS", } ], "runEndTime" : ISODate("2015-08-19T09:39:13.528Z"), "runStartTime" : ISODate("2015-08-19T09:39:00.406Z"), "runStatus" : "PASS", "__v" : 1 }, { "_id" : ObjectId("55d44f0bc0422e7b8bffe76f"), "executionProject" : "Project1", "suiteList" : [ { "suiteStatus" : "FAIL", } ], "runEndTime" : ISODate("2015-08-19T09:46:31.108Z"), |
"runStartTime" : ISODate("2015-08-19T09:40:27.377Z"), "runStatus" : "PASS", "__v" : 1 }, { "_id" : ObjectId("55d463d0c0422e7b8bffe789"), "executionProject" : "Project2", "suiteList" : [ { "suiteStatus" : "FAIL" }, { "suiteStatus" : "PASS" } ], "runEndTime" : ISODate("2015-08-19T11:09:52.537Z"), "runStartTime" : ISODate("2015-08-19T11:09:04.539Z"), "runStatus" : "FAIL", "__v" : 1 }, { "_id" : ObjectId("55d464ebc0422e7b8bffe7c2"), "executionProject" : "Project3", "suiteList" : [ { "suiteStatus" : "FAIL" } ], "runEndTime" : ISODate("2015-08-19T11:18:41.460Z"), "runStartTime" : ISODate("2015-08-19T11:13:47.268Z"), "runStatus" : "FAIL", "__v" : 10 } |
希望输出如下:
[ { "executionProject": "Project1", "suite-pass": 0, "suite-fail": 1, "runEndTime": ISODate("2015-08-19T09:46:31.108Z") }, { "executionProject": "Project2", "suite-pass": 1, "suite-fail": 1, "runEndTime": ISODate("2015-08-19T11:09:52.537Z") }, { "executionProject": "Project3", "suite-pass": 0, "suite-fail": 1, "runEndTime": ISODate("2015-08-19T11:18:41.460Z") }] |
$map+in+$eq 组合分别对 pass, fail 状态处理,最后与最新日期的 runEndTime 合并输出,实现比较复杂。
使用集算器, 分组时获取每组最新记录,再对最新记录处理,实现比较容易。
集算器安装包可去润乾网站下载,运行时需要一个授权,免费版本就够用。
我们将上述事例实现步骤:
1、在集算器中编写脚本 project.dfx:
A | B | |
1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | / 连接 MongDB 数据库 |
2 | =mongo_shell(A1,"project.find(,{_id:0})").fetch() | / 获取集合 project 数据 |
3 | =A2.group(executionProject; (r=~.maxp(runEndTime)).suiteList, r.runEndTime) | / 分组取日期最新记录 |
4 | =A3.new(executionProject,if(suiteList.(suiteStatus).pos("PASS"), 1, 0): suite-pass, if(suiteList.(suiteStatus).pos("FAIL"),1,0): suite-fail, runEndTime) | / 获取最新记录的状态 |
5 | >A1.close() | / 关闭连接 |
A3 | executionProject | suiteList | runEndTime |
Project1 | [[FAIL]] | 2015-08-19 17:46:31 | |
Project2 | [[FAIL],[PASS]] | 2015-08-19 19:09:52 | |
Project3 | [[FAIL]] | 2015-08-19 19:18:41 |
A4 | executionProject | suite-pass | suite-fail | runEndTime |
Project1 | 0 | 1 | 2015-08-19 17:46:31 | |
Project2 | 1 | 1 | 2015-08-19 19:09:52 | |
Project3 | 0 | 1 | 2015-08-19 19:18:41 |
A3 分组获取每组的日期最新记录,将需要的字段提取出来。
集算器提供了 JDBC 接口,脚本 project.dfx 很容易集成到 Java 中:
public static void doWork() {
Connection con = null;
java.sql.Statement st;
try{
Class.forName("com.esproc.jdbc.InternalDriver");
con = DriverManager.getConnection("jdbc:esproc:local://");
// 调用脚本 project.dfx
st=con.createStatement();
ResultSet rst = st.executeQuery("call project");
System.out.println(rst);
}catch(Exception e){
System.out.println(e);
}finally{
// 关闭连接
if (con!=null) {
try {
con.close();
}catch(Exception e) {
System.out.println(e);
}
}
}
}
分组后根据某一字段获取每组的最大值或最小值,也可以进行类似处理。
集算器与 JAVA 集成的进一步信息可参考:《Java 如何调用 SPL 脚本》。
英文版