CSV filtering and aggregate calculation

【Question】

In the Bookings.csv file, each line contains a name, surname, room type, check-in date, and check-out date, separated by semicolons.

Name;Surname;roomtype1.2;2015-03-24;2015-03-26  
Paul;Smith;roomtype1.1;2015-03-21;2015-03-23  
Romas;Babajus;roomtype2.1;2015-03-26;2015-03-28  
Bob;Alfredo;roomtype3.1;2015-03-24;2015-03-26  
Edvard;Jogn;roomtype2.2.;2015-03-04;2015-03-25  
Jonas;Amberto;roomtype3.2;2015-03-20;2015-03-23

In roomtype1.*, roomtype2.*, and roomtype3.*, the “*” identifies an individual room of that type; for example, roomtype1.1 and roomtype1.2 are both considered to be roomtype1.

When there’s a new booking for roomtype1, the program should find the check-out dates of the rooms that are roomtype1 (roomtype1.1 and roomtype1.2) and compare each check-out date in order to find the room whose check-out date is closest to the new booking date.

So far I have only been able to read all the dates stored in Bookings.csv, without knowing which room type those dates belong to.

How would you suggest reading only the roomtype1 check-out dates from a CSV file? Would the best way be to use a two-dimensional array and loop over the file?

So far my code looks like this if that helps.


    import java.io.File;
    import java.io.FileNotFoundException;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Scanner;
    import java.util.concurrent.TimeUnit;

    public class Bookings {

        static long difv;

        public static void main(String[] args) throws Exception {
            SimpleDateFormat ft = new SimpleDateFormat("yyyy-MM-dd");
            Date checkIn = null;
            Date checkOut = null;
            Date test = ft.parse("2015-03-30");
            String fileName = "Bookings.csv";
            File file = new File(fileName);
            try {
                Scanner inputStream = new Scanner(file);
                // Each whitespace-delimited token is one booking line
                while (inputStream.hasNext()) {
                    String data = inputStream.next();
                    String[] values = data.split(";");
                    checkIn = ft.parse(values[3]);
                    checkOut = ft.parse(values[4]);
                    // System.out.println("Check in date");
                    // System.out.println(checkIn);
                    // System.out.println("Check out date");
                    // System.out.println(checkOut);
                }
                inputStream.close();
                // interval(checkOut, test, TimeUnit.HOURS);
                // System.out.println("the difv is " + difv);
                // if (checkOut.compareTo(test) <= 0) { // or equal
                //     System.out.println("Date1 is after or equal to Date2");
                // } else if (checkOut.compareTo(test) < 0) {
                //     System.out.println("Date1 is before Date2");
                // } else if (checkOut.compareTo(test) == 0) {
                //     System.out.println("Date1 is equal to Date2");
                // } else {
                //     System.out.println("How to get here?");
                // }
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            }
        }

        // Difference between the two dates, converted to the given time unit
        public static long interval(Date checkOut, Date test, TimeUnit timeunit) {
            long diff = test.getTime() - checkOut.getTime();
            difv = timeunit.convert(diff, TimeUnit.MILLISECONDS);
            return difv;
        }
    }

Thank you in advance for your help!

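【Answer】

This problem calls for filtering on a field and finding a maximum, both of which are basic operations on structured data. Java lacks a class library aimed at this kind of computation, so the implementation becomes complicated and the code is hard to read. In this situation SPL can be used to help, and the resulting code is more intuitive and easier to follow: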



         A
    1    =file("Bookings.csv").import@tc(;,";")
    2    =A1.select(like(roomtype, argtype+".*"))
    3    =A2.maxp(interval('check-out',argDate))


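A1: Read the file, using the semicolon as the separator and treating the first line as field names.

A2: Filter on the roomtype field; argtype is the query parameter and * is the wildcard.

A3: Find the record with the maximum value, where the value being compared is the interval between the check-out field and the input parameter argDate.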
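For comparison, the same three steps (read, filter by room type, pick one record by its date interval) can be sketched in plain Java. This is only an illustrative sketch under stated assumptions, not the asker's or esProc's implementation: the class and method names (`ClosestCheckOut`, `Booking`, `closest`) are hypothetical, the lines are assumed to have no header row (as in the sample data), and "closest" is interpreted as the smallest absolute number of days between the check-out date and the booking date, which differs from a plain maximum.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
final class ClosestCheckOut {

    /** One parsed booking line. */
    static final class Booking {
        final String name, surname, room;
        final LocalDate checkIn, checkOut;

        Booking(String name, String surname, String room,
                LocalDate checkIn, LocalDate checkOut) {
            this.name = name;
            this.surname = surname;
            this.room = room;
            this.checkIn = checkIn;
            this.checkOut = checkOut;
        }
    }

    /** Parses "name;surname;room;check-in;check-out" with yyyy-MM-dd dates. */
    static Booking parse(String line) {
        String[] v = line.trim().split(";");
        return new Booking(v[0], v[1], v[2],
                LocalDate.parse(v[3]), LocalDate.parse(v[4]));
    }

    /**
     * Returns the booking of the given room type whose check-out date is
     * nearest (in absolute days) to bookingDate, or empty if none matches.
     * Assumption: lines carry no header row.
     */
    static Optional<Booking> closest(List<String> lines, String roomType,
                                     LocalDate bookingDate) {
        return lines.stream()
                .filter(l -> !l.trim().isEmpty())
                .map(ClosestCheckOut::parse)
                // "roomtype1" matches roomtype1.1, roomtype1.2, ...
                .filter(b -> b.room.startsWith(roomType + "."))
                .min(Comparator.comparingLong((Booking b) ->
                        Math.abs(ChronoUnit.DAYS.between(b.checkOut, bookingDate))));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Name;Surname;roomtype1.2;2015-03-24;2015-03-26",
                "Paul;Smith;roomtype1.1;2015-03-21;2015-03-23",
                "Romas;Babajus;roomtype2.1;2015-03-26;2015-03-28");
        closest(lines, "roomtype1", LocalDate.parse("2015-03-30"))
                .ifPresent(b -> System.out.println(b.room + " " + b.checkOut));
        // prints "roomtype1.2 2015-03-26"
    }
}
```

To run it against the real file, the list could be obtained with `Files.readAllLines(Paths.get("Bookings.csv"))`. Changing the comparator passed to `min` is enough to apply a different rule for choosing the record.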

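esProc provides a JDBC interface that is easy to integrate. Combining Java with esProc can greatly simplify structured computation in Java. For details, see:

How to call an SPL script from Java

esProc helps Java process structured text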