How to calculate an aggregate value from the data in a large CSV file
Question
I'm trying to work out the most expensive county in which to rent a building, using data in a CSV file. Each column I need has been read into its own list. The price range is set by the user, so the outermost for loop and if statement ensure that only buildings within that range are considered.
The price of a building is also slightly complicated, because the total price is the minimum stay × the price.
In the code below I am trying to get the average property value for one county, just so I can get the basic structure right before I carry on, but I'm kind of lost at this point. Any help would be much appreciated.
public int sampleMethod()
{
    ArrayList<String> county = new ArrayList<String>();
    ArrayList<Integer> costOfBuildings = new ArrayList<Integer>();
    ArrayList<Integer> minimumStay = new ArrayList<Integer>();
    ArrayList<Integer> minimumBuildingCost = new ArrayList<Integer>();
    try {
        // Code to read data from the CSV and put the data in the lists.
    }
    catch (IOException | URISyntaxException e) {
        // Some code.
    }
    int count = 0;
    int avgCountyPrice = 0;
    int countyCount = 0;
    for (int cost : costOfBuildings) {
        if (costOfBuildings.get(count) >= controller.getMin() && costOfBuildings.get(count) <= controller.getMax()) {
            for (String currentCounty : county) {
                for (int currentMinimumStay : minimumStay) {
                    if (currentCounty.equals("sample county")) {
                        countyCount++;
                        int temp = nightsPermitted * cost;
                        avgCountyPrice = avgCountyPrice + temp / countyCount;
                    }
                }
            }
        }
        count++;
    }
    return avgCountyPrice;
}
Here is a sample table showing what the CSV looks like. The actual CSV file has more than 50,000 rows.
name | county | price | minStay
Morgan | lydney | 135 | 5
John | sedury | 34 | 1
Patrick | newport | 9901 | 7
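Answer
This problem requires grouping the data in the CSV file by county, computing the average price for each group, and then finding the county with the highest average price. A Java implementation of this is fairly long.
With SPL, an open-source package for Java, it is easy to write and takes only one statement: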
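To make the grouping step concrete, here is a minimal plain-Java sketch of the same idea: group by county, average the price, and pick the county with the highest average. It is only an illustration under assumptions. The class name `MostExpensiveCounty`, the method name, and the comma-separated layout `name,county,price,minStay` with a header row are not from the original post, and the lines are passed in as a list rather than read from a file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MostExpensiveCounty {

    // Returns the county with the highest average price, or null if there are no data rows.
    // Assumed layout: "name,county,price,minStay", with the first line being the header.
    static String mostExpensiveCounty(List<String> csvLines) {
        // county -> {sum of prices, number of rows}
        Map<String, long[]> totals = new LinkedHashMap<>();
        for (int i = 1; i < csvLines.size(); i++) {   // start at 1 to skip the header
            String[] fields = csvLines.get(i).split(",");
            String county = fields[1].trim();
            long price = Long.parseLong(fields[2].trim());
            long[] t = totals.computeIfAbsent(county, k -> new long[2]);
            t[0] += price;
            t[1]++;
        }

        // Pick the county whose average (sum / count) is largest.
        String best = null;
        double bestAvg = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, long[]> e : totals.entrySet()) {
            double avg = (double) e.getValue()[0] / e.getValue()[1];
            if (avg > bestAvg) {
                bestAvg = avg;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "name,county,price,minStay",
                "Morgan,lydney,135,5",
                "John,sedury,34,1",
                "Patrick,newport,9901,7");
        System.out.println(mostExpensiveCounty(lines)); // prints "newport"
    }
}
```

Keeping a running sum and count per county means the data is traversed once, which matters for a file with more than 50,000 rows.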
  | A
1 | =file("data.csv").import@ct().groups(county;avg(price):price_avg).top(-1;price_avg).county
SPL provides a JDBC driver that Java can call. Save the script above as mostExpensiveCounty.splx and call the script file from Java in the same way as a stored procedure:
…
Class.forName("com.esproc.jdbc.InternalDriver");
con = DriverManager.getConnection("jdbc:esproc:local://");
st = con.prepareCall("call mostExpensiveCounty()");
st.execute();
…
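For readers who want to stay with the parallel lists from the question, the following is a hedged sketch of how the average for a single county could be computed in one pass. The nested loops in the question pair every cost with every county and every minimum stay; using one shared index keeps the three values of a row together. The class name `CountyAverage`, the method signature, and the `min`/`max` parameters (standing in for `controller.getMin()` and `controller.getMax()`) are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;

class CountyAverage {

    // Average total cost (price * minimum stay) of the buildings in targetCounty
    // whose price lies in [min, max]. Returns 0.0 when no building matches.
    static double averageCost(List<String> county,
                              List<Integer> costOfBuildings,
                              List<Integer> minimumStay,
                              String targetCounty,
                              int min, int max) {
        long total = 0;
        int matches = 0;
        // One index walks all three lists so each row's values stay aligned.
        for (int i = 0; i < costOfBuildings.size(); i++) {
            int cost = costOfBuildings.get(i);
            if (cost < min || cost > max) {
                continue; // outside the user's price range
            }
            if (!county.get(i).equals(targetCounty)) {
                continue; // different county
            }
            total += (long) cost * minimumStay.get(i);
            matches++;
        }
        // Divide once at the end rather than inside the loop.
        return matches == 0 ? 0.0 : (double) total / matches;
    }

    public static void main(String[] args) {
        List<String> county = Arrays.asList("lydney", "sedury", "newport");
        List<Integer> cost = Arrays.asList(135, 34, 9901);
        List<Integer> minStay = Arrays.asList(5, 1, 7);
        // Only the lydney row matches: 135 * 5 = 675
        System.out.println(averageCost(county, cost, minStay, "lydney", 0, 1000)); // prints "675.0"
    }
}
```

Accumulating a total and a count and dividing once at the end avoids the integer-division drift that comes from updating the average inside the loop.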