Structured Text Query

【Question】

I'm trying to check whether column 3 of a tab-delimited file contains a certain word. If it does not, it should continue reading. If it does contain the word, it should check column 4. Depending on whether there is content in column 4, the output should be something found or something not found.

I'm now stuck on the second part of this, i.e. checking column 4. My output gives me "something found" when there is in fact no content there.

for line in f:
    if line.strip().split("\t")[2] == "word":
        print("word")
        if line.strip().split("\t")[3] is not None:
            print("something found")
        else:
            print("nothing found")

The file looks like this:

reference #1 reference #2 notword content ...(more columns)
reference #1 reference #2 word content ...
reference #1 reference #2 word noContent ...

 

【Answer】

This can be done with the simplest kind of structured query; the Python code is more complicated than it needs to be. SPL is a better fit here:


     A
1    =file("d:\\data.csv").import()
2    =A1.select(_3:"word", _4:"content")

A1: Read the text file.

A2: Select the records whose third column is word and whose fourth column is content.
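For comparison, the Python loop in the question misreports because str.split always returns strings, so the `is not None` test is always true. Below is a minimal sketch of a corrected check. It assumes that "no content" means an empty fourth field, which the sample file does not make explicit; the function name check_line and the sample rows are hypothetical.

```python
def check_line(line, word="word"):
    """Return None if column 3 is not `word`, else a message about column 4."""
    # Strip only the line ending: a plain strip() would also remove trailing
    # tabs, and with them an empty last column.
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) < 3 or cols[2] != word:
        return None
    # split() yields strings, never None, so test for emptiness instead.
    if len(cols) > 3 and cols[3].strip():
        return "something found"
    return "nothing found"


# Hypothetical sample rows mirroring the file layout in the question.
sample = [
    "ref1\tref2\tnotword\tcontent\n",
    "ref1\tref2\tword\tcontent\n",
    "ref1\tref2\tword\t\n",
]
results = [check_line(row) for row in sample]
print(results)
```

Lines whose third column does not match return None and can simply be skipped by the caller.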

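In the same way, editing the condition in the select function covers all kinds of structured queries. If the condition is dynamic, it can be supplied through a parameter and macro replacement. For example, modifying the script above: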

 



     A
1    =file("d:\\data.csv").import()
2    =A1.select(${where})

A2: The parameter where passes different conditions to the select function, which makes the query dynamic.
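The same idea of a caller-supplied condition can be sketched in Python. This is only an analogy: it passes a predicate function instead of using SPL's macro replacement, and the helper select_rows and the in-memory sample data are hypothetical.

```python
import csv
import io


def select_rows(lines, where):
    """Return the tab-separated rows (lists of strings) for which where(row) is true."""
    reader = csv.reader(lines, delimiter="\t")
    return [row for row in reader if where(row)]


# Hypothetical in-memory data standing in for the tab-delimited file.
text = (
    "ref1\tref2\tnotword\tcontent\n"
    "ref1\tref2\tword\tcontent\n"
    "ref1\tref2\tword\t\n"
)

# The condition is an ordinary argument, so each call can use a different one.
with_content = select_rows(
    io.StringIO(text), lambda r: r[2] == "word" and r[3] == "content"
)
without_content = select_rows(
    io.StringIO(text), lambda r: r[2] == "word" and r[3] == ""
)
print(with_content)
print(without_content)
```

Because the predicate is ordinary code rather than a string spliced into an expression, the condition cannot be typed in as free text at run time the way an SPL parameter can.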

To call the finished script from an application, see "How to call an SPL script in Java".