Joining two CSV files by record number
【Question】
I want to join (or, you could say, union) two CSV files into a third output .csv file. For example, say I have one CSV file with fields a,b,c,d,e and another with fields a,b,c,x,y,z. I need the output file to contain a,b,c,d,e,x,y,z.
I tried the following but am not getting the desired output. Please let me know working Java code.
int n1 = 3,n2 = 3;//stores the serial number of the column that has the duplicate data
BufferedReader br1=new BufferedReader(new InputStreamReader(new FileInputStream("c:/test/report1.csv")));
BufferedReader br2=new BufferedReader(new InputStreamReader(new FileInputStream("c:/test/report2.csv")));
String line1,line2;
while((line1=br1.readLine())!=null && (line2=br2.readLine())!=null){
String line=line1+","+line2;
String newL="";
StringTokenizer st=new StringTokenizer(line,",");
System.out.println("total tokens "+st.countTokens());
for(int i=1;i<=st.countTokens();i++){
if((i==n1)||(i==n1+n2))
continue;
else
newL=newL+","+st.nextToken();
}
String l=newL.substring(1);
System.out.println("merged "+l);
//write this line to the output file
}
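【Answer】
Merging structured files by record number is cumbersome in plain Java, because its standard library has no ready-made classes for structured (table-style) computation. In such cases esProc can be used alongside Java. The SPL code is as follows: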
| | A |
|---|---|
| 1 | =file("d:\\report1.csv").import@ct() |
| 2 | =file("d:\\report2.csv").import@ct() |
| 3 | =join@p(A1;A2).new(_1.a ,_1.b ,_1.c ,_1.d ,_1.e ,_2.x ,_2.y ,_2.z) |
| 4 | =file("D:\\result.csv").export@t(A3;",") |
A1-A2: Read the data from each of the two CSV files.
A3: Merge the two files by record number and select the required fields.
A4: Write the result to result.csv.
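For comparison, the same positional merge can be sketched in plain Java. This is only a minimal sketch under stated assumptions: fields are separated by plain commas with no quoted or embedded commas, the shared columns are the first three fields of each line, and the class and method names (`CsvPositionalMerge`, `mergeLine`, `mergeFiles`) are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class CsvPositionalMerge {

    // Appends the right-hand line to the left-hand line, dropping the first
    // `shared` fields of the right-hand line (the duplicated columns).
    // Assumption: simple CSV, no quoted fields containing commas.
    static String mergeLine(String left, String right, int shared) {
        String[] r = right.split(",", -1); // -1 keeps trailing empty fields
        if (r.length <= shared) {
            return left; // right-hand line has nothing beyond the shared columns
        }
        return left + "," + String.join(",", Arrays.copyOfRange(r, shared, r.length));
    }

    // Reads both files in step and writes one merged line per record number.
    // Stops at the end of the shorter file.
    static void mergeFiles(Path in1, Path in2, Path out, int shared) throws IOException {
        try (BufferedReader br1 = Files.newBufferedReader(in1, StandardCharsets.UTF_8);
             BufferedReader br2 = Files.newBufferedReader(in2, StandardCharsets.UTF_8);
             BufferedWriter bw = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line1, line2;
            while ((line1 = br1.readLine()) != null && (line2 = br2.readLine()) != null) {
                bw.write(mergeLine(line1, line2, shared));
                bw.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: CsvPositionalMerge <left.csv> <right.csv> <out.csv>");
            return;
        }
        // 3 = number of shared leading columns (a,b,c) in the example above
        mergeFiles(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]), 3);
    }
}
```

Because the header line is merged like any other line, the output header becomes a,b,c,d,e,x,y,z when both inputs carry a header row. Real-world CSV with quoted fields would need a proper CSV parser instead of `split`.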
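Besides merging by record number, it is more common to merge on one or more fields (similar to a JOIN statement in SQL); see [Structured Text Computation Examples].
esProc can not only perform join computations but also integrate with Java through JDBC; see "How to Call an SPL Script in Java".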
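To illustrate what a field-based merge involves, here is a hedged plain-Java sketch of an inner join on the leading key fields, using a hash index over the second file. It assumes simple comma-separated lines without quoted commas and that the key columns come first in both files; `CsvKeyJoin` and `innerJoin` are hypothetical names for this illustration, not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CsvKeyJoin {

    // Inner join of two lists of CSV lines on their first `keyCount` fields.
    // Each output line is the left line followed by the non-key fields of a
    // matching right line. Assumption: no quoted fields containing commas.
    static List<String> innerJoin(List<String> left, List<String> right, int keyCount) {
        // Index the right-hand lines: key fields -> remaining fields (with leading comma)
        Map<List<String>, List<String>> index = new HashMap<>();
        for (String line : right) {
            String[] f = line.split(",", -1);
            if (f.length < keyCount) {
                continue; // skip lines too short to carry a full key
            }
            List<String> key = Arrays.asList(Arrays.copyOfRange(f, 0, keyCount));
            String rest = f.length > keyCount
                    ? "," + String.join(",", Arrays.copyOfRange(f, keyCount, f.length))
                    : "";
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(rest);
        }

        // Probe the index with each left-hand line, preserving left-hand order
        List<String> out = new ArrayList<>();
        for (String line : left) {
            String[] f = line.split(",", -1);
            if (f.length < keyCount) {
                continue;
            }
            List<String> key = Arrays.asList(Arrays.copyOfRange(f, 0, keyCount));
            for (String rest : index.getOrDefault(key, Collections.emptyList())) {
                out.add(line + rest);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("a,b,c,d,e", "1,2,3,4,5", "6,7,8,9,10");
        List<String> right = Arrays.asList("a,b,c,x,y,z", "6,7,8,X,Y,Z", "1,2,3,p,q,r");
        // Rows are matched on (a,b,c) regardless of their order in the second file
        for (String row : innerJoin(left, right, 3)) {
            System.out.println(row);
        }
    }
}
```

Unlike the positional merge, this matches rows even when the two files list records in different orders, and left-hand rows without a match are dropped, as in an SQL inner join.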