Querying a Large Text File
[Question]
I have a 13 GB text file. Each line holds a table-like row whose values are separated by commas. I need to search this file for a particular entry, but because of the file's size, ordinary file operations stop responding. I am using PHP's SplFileObject class for the file operations and arrays to store the tokenized values of each row, and I run a comparison on every iteration as each new line is read.
Can anyone suggest how I should proceed, in terms of data structure or programming methodology, to find a better approach?
[Answer]
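Importing the data into a database can solve the problem for the time being, but it is not a good fit if the data changes frequently, because the import itself is very slow (it has to deal with issues such as data consistency). In this situation, esProc SPL makes the task easy, since it can query the text file directly. The code is as follows: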
|   | A |
| 1 | =file("D:/employee.txt").cursor@tc() |
| 2 | =A1.select(BIRTHDAY>=date("1981-01-01") && GENDER=="F") |
| 3 | =A2.fetch() |
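The condition in A2 selects female employees born on or after January 1, 1981. It can be rewritten as needed, or supplied through parameters to make the query dynamic. If the query returns many rows, change A3 to file("D:/result.txt").export(A2), which writes the result straight to a file.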
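For readers who do not use esProc, the same cursor-style flow (read one row at a time, filter, write matches to a file) can be sketched in Python. This is only an illustration of the streaming idea, not the esProc implementation. It assumes the file starts with a header line containing BIRTHDAY and GENDER columns and that BIRTHDAY is stored as YYYY-MM-DD; the helper names `make_predicate` and `export_matches` are hypothetical.

```python
import csv
from datetime import date
from typing import Callable, Dict

Row = Dict[str, str]


def make_predicate(born_on_or_after: date, gender: str) -> Callable[[Row], bool]:
    """Build the row filter from parameters, so the query can change at run time."""
    def predicate(row: Row) -> bool:
        # Assumption: BIRTHDAY is an ISO date string such as "1981-01-01".
        return (date.fromisoformat(row["BIRTHDAY"]) >= born_on_or_after
                and row["GENDER"] == gender)
    return predicate


def export_matches(src: str, dst: str, predicate: Callable[[Row], bool]) -> int:
    """Stream src row by row and write matching rows to dst.

    Only one row is held in memory at a time, so the size of the
    source file does not matter. Returns the number of rows written.
    """
    matched = 0
    with open(src, newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin)  # first line is taken as the header
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames or [])
        writer.writeheader()
        for row in reader:
            if predicate(row):
                writer.writerow(row)
                matched += 1
    return matched
```

A call such as `export_matches("D:/employee.txt", "D:/result.txt", make_predicate(date(1981, 1, 1), "F"))` would correspond roughly to the export variant described above. Writing matches to a file rather than collecting them in a list keeps memory use flat even when many rows match.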
Besides the approach above, esProc can also run SQL directly against a txt file, for example:
|   | A |
| 1 | $select * from test.txt where BIRTHDAY>=date('1981-01-01') and GENDER='F' |