A Java class library for file-based computation

【Question】

For my Java project I am looking for a convenient way to store my data. I have the following requirements:

1. It should be easy to synchronize with Subversion (which I use for my Java code and other stuff), so I guess a file-based format is appropriate.

2. I want to be able to get certain elements without having to read all the data into memory, as in a database (“give me all objects with/without property x”, “give me all information about the object with a certain ID”).

3. I want to be able to both read and write data in this way.

I guess a database is overkill for my purpose and difficult to sync, and I would have to be admin/root on all machines to install it. (Right?)

So I was thinking of using XML, but I heard that XML parsing in Java does not work very well. Or can anyone point me to a good library?

Then I was thinking of CSV. But all examples I saw (here and elsewhere) read the data into memory before processing it, which is not what I want.

I hope you can help me with this problem, because I am not so experienced with Java.

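【Answer】

There are three requirements: store the data in files rather than in a database; make structured-data computation convenient; and handle files that are too large to read into memory.

Java itself does not directly provide a class library for this, so implementing it by hand is quite tedious. You could try using SPL alongside Java to meet these requirements: esProc provides a fairly complete class library for structured-data processing and is easy to embed in Java. For example, filtering a file can be written simply as: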


     A
1    =file("data.csv").cursor().select(BIRTHDAY>=date("1981-01-01")&&GENDER=="F")
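
For comparison, here is a minimal plain-Java sketch of the same filter that streams the file one line at a time instead of loading it into memory. It is not SPL and does not use any esProc API; the class name CsvStreamFilter and the method filter are hypothetical, and it assumes data.csv has a header row containing BIRTHDAY and GENDER columns, ISO dates (yyyy-MM-dd), and no quoted fields with embedded commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of SPL or esProc.
class CsvStreamFilter {

    /**
     * Reads the CSV line by line and hands every row with
     * BIRTHDAY >= minBirthday and GENDER equal to gender to the sink.
     * Only the current line is held in memory. Returns the number of matches.
     */
    static long filter(Path csv, LocalDate minBirthday, String gender,
                       Consumer<String[]> sink) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            String header = reader.readLine();
            if (header == null) {
                return 0; // empty file
            }
            // Locate the two columns by name (assumed column names).
            List<String> columns = Arrays.asList(header.split(",", -1));
            int birthdayIdx = columns.indexOf("BIRTHDAY");
            int genderIdx = columns.indexOf("GENDER");
            if (birthdayIdx < 0 || genderIdx < 0) {
                throw new IllegalArgumentException("BIRTHDAY or GENDER column not found");
            }

            long matches = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Naive split: assumes no quoted fields containing commas.
                String[] fields = line.split(",", -1);
                LocalDate birthday = LocalDate.parse(fields[birthdayIdx]);
                if (!birthday.isBefore(minBirthday) && gender.equals(fields[genderIdx])) {
                    sink.accept(fields);
                    matches++;
                }
            }
            return matches;
        }
    }
}
```

A call such as CsvStreamFilter.filter(Paths.get("data.csv"), LocalDate.of(1981, 1, 1), "F", row -> System.out.println(String.join(",", row))) would print the matching rows. Every additional kind of query (grouping, joins, sorting) needs more hand-written code of this sort, which is the effort a structured-data library is meant to save.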


For details on SPL and how to embed it in Java, see:

How Java calls SPL scripts

esProc assists Java in processing structured text