Grouping and Aggregation
[Question]
I have an ArrayList that is created from an input CSV file. Some calculations are performed on it to generate more columns, and the result is then written to another CSV file. One of the attributes in the ArrayList is a Unix timestamp, and the records span about 7 different days. What I want to do is group the records by day and then, if a group is not already in order, sort its records by time (hours, minutes, seconds). From my input CSV file, I extracted the Unix timestamp by splitting on the delimiter, e.g. 1442327884, and then used this code to retrieve the day:
java.util.Date time = new java.util.Date((long) timeStamp * 1000);
// gives a result of Tue Sep 15 22:38:04 SGT 2015
String date = String.valueOf(time.getDate());
// gives the result of "15"
For another calculation, I grouped the data as follows:
Map<String, List<String>> groups = data.stream().collect(Collectors.groupingBy(e -> e.split(",")[1]));
How do I make it group by the String date mentioned above?
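[Answer]
Grouping and aggregation is a typical structured-data computation. Java itself lacks a commonly used class library for it, so hard-coding it is cumbersome. One option is to use SPL to help: the code is short and easy to read, and it can be integrated into Java without much effort (see "How to Call an SPL Script in Java").
The question gives no concrete data, so assume the sample data is as follows: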
id time_stamp
1 1442327884
2 1442337884
3 1442347884
4 1442338884
The SPL code is as follows:
  | A
1 | =file("d:\\source.csv").import@ct()
2 | =A1.run(time_stamp=date(time_stamp))
3 | =A2.group(time_stamp)
4 | =A3.conj()
5 | =file("d:\\result.csv").export@tc(A4)
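A1: Read the data from source.csv.
A2: Convert the Unix timestamp to a date value.
A3: Group the records by the converted date, with the groups ordered by date.
A4: Concatenate the records of all groups.
A5: Export the result to result.csv.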
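For comparison, the same grouping can be sketched in plain Java with java.time and the groupingBy collector from the question. This is only a minimal sketch under stated assumptions, not the asker's actual code: the class name GroupByDay and its helper methods are hypothetical, each record is assumed to be a comma-separated line whose second field is the Unix timestamp in seconds, and the Asia/Singapore zone is inferred from the "SGT" output shown in the question.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupByDay {
    // Assumption: the Unix timestamp (seconds) is the second comma-separated field.
    static long timestampOf(String line) {
        return Long.parseLong(line.split(",")[1].trim());
    }

    // Convert the timestamp of a record to a calendar day in the given time zone.
    static LocalDate dayOf(String line, ZoneId zone) {
        return Instant.ofEpochSecond(timestampOf(line)).atZone(zone).toLocalDate();
    }

    // Sort all records by timestamp first, then group them by day.
    // The TreeMap keeps the days in ascending order; because the stream is
    // sequential and already sorted, each group's list stays ordered by time.
    static Map<LocalDate, List<String>> groupByDay(List<String> data, ZoneId zone) {
        return data.stream()
                .sorted(Comparator.comparingLong(GroupByDay::timestampOf))
                .collect(Collectors.groupingBy(
                        line -> dayOf(line, zone),
                        TreeMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        // Sample records from the answer above, written as "id,time_stamp".
        List<String> data = List.of(
                "1,1442327884", "2,1442337884", "3,1442347884", "4,1442338884");
        ZoneId zone = ZoneId.of("Asia/Singapore"); // assumed from "SGT" in the question
        groupByDay(data, zone)
                .forEach((day, lines) -> System.out.println(day + " -> " + lines));
    }
}
```

With the sample data, the first record falls on 2015-09-15 and the other three on 2015-09-16, where record 4 is placed before record 3 because its timestamp is earlier. Grouping on a LocalDate rather than on the day-of-month string from time.getDate() also avoids merging records from different months that share the same day number.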