Java: Count the Records in Each Group of a Large CSV File
The CSV file data.csv is too large to fit in memory. Its third column is the grouping column.
Date,Time,Sub User,Access Method
10-10-2023,00:03:06,JL,cli
10-10-2023,00:02:20,TW2JL,app
10-10-2023,00:03:26,JL,cli
10-10-2023,00:03:34,JL,cli
10-10-2023,00:03:35,JL,cli
10-10-2023,00:03:46,JL,cli
10-10-2023,00:04:09,JL,cli
10-10-2023,00:04:51,JL,cli
10-10-2023,00:04:56,JL,cli
10-10-2023,00:05:58,JL,cli
10-10-2023,00:06:29,JL,cli
10-10-2023,00:06:42,JL,cli
10-10-2023,00:26:35,TW2JL,app
10-10-2023,00:30:01,TW2JL,app
10-10-2023,00:30:02,TW2JL,app
10-10-2023,00:30:05,TW2JL,app
10-10-2023,00:33:42,TW2JL,app
10-10-2023,00:36:36,TW2JL,app
10-10-2023,00:45:10,TW2JL,app
10-10-2023,00:53:01,TW2JL,app
10-10-2023,00:53:24,TW2JL,app
10-10-2023,01:03:14,TW2JL,app
10-10-2023,01:03:18,TW2JL,app
10-10-2023,01:03:20,TW2JL,app
Task: use Java to group the rows by the third column and count the records in each group.
Sub User    cnt
JL          11
TW2JL       13
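Write the SPL statement:
=T@c("data.csv").groups('Sub User';count(1):cnt)
The T function parses the CSV file, and the @c option opens it as a cursor so that a large file is read in batches instead of being loaded whole. The groups function groups the rows by 'Sub User' and aggregates each group, here with count(1) named cnt.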
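For comparison, the same streaming idea (read one row at a time and keep only the per-group counters in memory) can be sketched in plain Java. This is a minimal illustration and not the SPL implementation: the class name GroupCount and the method countByColumn are hypothetical, and the sketch assumes a simple CSV with a header row and no quoted fields containing commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, for illustration only.
public class GroupCount {

    // Reads the CSV one line at a time, so memory use depends on the
    // number of distinct groups, not on the size of the file.
    // Assumption: the first line is a header and fields contain no quoted commas.
    public static Map<String, Long> countByColumn(Reader source, int columnIndex)
            throws IOException {
        // TreeMap keeps the groups sorted by key.
        Map<String, Long> counts = new TreeMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine(); // skip the header row
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                String[] fields = line.split(",", -1);
                if (fields.length <= columnIndex) {
                    continue; // ignore rows that are too short
                }
                counts.merge(fields[columnIndex].trim(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // With a path argument, read that file; otherwise use a tiny inline sample.
        Reader source = args.length > 0
                ? Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)
                : new StringReader("Date,Time,Sub User,Access Method\n"
                        + "10-10-2023,00:03:06,JL,cli\n"
                        + "10-10-2023,00:02:20,TW2JL,app\n"
                        + "10-10-2023,00:03:26,JL,cli\n");
        // Column index 2 is the third column, "Sub User".
        countByColumn(source, 2)
                .forEach((group, cnt) -> System.out.println(group + "," + cnt));
    }
}
```

Running it as java GroupCount data.csv would print one group,count line per Sub User. A file with quoted or multi-line fields would need a real CSV parser in place of split.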
To integrate SPL into a Java application, see "How to Call an SPL Script in Java".
Source of the question: https://stackoverflow.com/questions/77281563/reading-a-csv-file-to-count-the-number-of-sub-users-based-on-the-access-method-i
English version: https://c.scudata.com/article/1724398159966