Joining a Text File with JSON

【Question】

I have a tab-delimited textfile A (representing a BLAST output)

Name1 BBBBBBBBBBBB 99.40 166  1 0 1 166  334  499  3e-82 302

Name2 DDDDDDDDDDDD 98.80 167  2 0 1 167  346  512  4e-81 298

and a textfile B (representing a phylogenetic dendrogram) looking like

{

 "member": {

 "Cluster A": "BBBBBBBBBBBB This is Animal A",

 },

 "name": "Cluster A"

 },

  {

 "member": {

 "Cluster B": "DDDDDDDDDDDD This is Animal B"

 },

 "name": "cluster B"

 }

I want to take the string found in the 2nd tab of textfile A (DDDDDDDDDDDD for example) and look it up in text file B. The script should then add the info found in textfile B into a new tab of textfile A :

Name1 BBBBBBBBBBBB 99.40 166  1 0 1 166  334  499  3e-82 302  Cluster A This is Animal A

Name2 DDDDDDDDDDDD 98.80 167  2 0 1 167  346  512  4e-81 298  Cluster B This is Animal B

【Answer】

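If you treat the two sources as tables, this is the kind of problem a SQL join solves. Neither Perl nor shell offers a join directly, though, so hand-written code gets complicated. esProc can simplify the task; the SPL code is as follows: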



     A
1    =json(file("json.txt").read())
2    =A1.new(#1.name:name,#1.(#1):cluster,(firstblank=pos(cluster," "),left(cluster,firstblank-1)):key,right(cluster,len(cluster)-firstblank):value)
3    =file("file.txt").import()
4    =join(A3,_2;A2,key).new(_1._1,_1._2,_1._3,_1._4,_1._5,_1._6,_1._7,_1._8,_2.name,_2.value)

A1: Read the JSON text.


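A2: Split the records in A1 to generate a new table sequence. The text before the first space in each member string becomes key, and the rest becomes value.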


A3: Read file.txt.


A4: Join A3 with A2, matching the second column of A3 against key in A2, and select the fields for the final result.

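For readers without esProc, the same lookup-and-append logic can be sketched in Python. This is only an illustration, not a translation of the SPL grid: it assumes file B is a valid JSON array of objects shaped like the sample, that file A is tab-delimited with the ID in its second column, and that rows without a match are dropped. The names build_lookup and join_rows are hypothetical helpers introduced here.

```python
import csv
import io
import json


def build_lookup(clusters):
    """Map each sequence ID to (cluster name, description)."""
    lookup = {}
    for cluster in clusters:
        for text in cluster["member"].values():
            # The ID is the text before the first space; the rest is the description.
            key, _, description = text.partition(" ")
            lookup[key] = (cluster["name"], description)
    return lookup


def join_rows(rows, lookup, key_index=1):
    """Append cluster name and description to each row whose key is found."""
    joined = []
    for row in rows:
        match = lookup.get(row[key_index])
        if match is not None:  # assumption: unmatched rows are dropped
            joined.append(list(row) + list(match))
    return joined


# Sample data modeled on the question (assumed valid JSON, tab-delimited text).
json_text = """[
  {"member": {"Cluster A": "BBBBBBBBBBBB This is Animal A"}, "name": "Cluster A"},
  {"member": {"Cluster B": "DDDDDDDDDDDD This is Animal B"}, "name": "Cluster B"}
]"""
blast_text = (
    "Name1\tBBBBBBBBBBBB\t99.40\t166\t1\t0\t1\t166\t334\t499\t3e-82\t302\n"
    "Name2\tDDDDDDDDDDDD\t98.80\t167\t2\t0\t1\t167\t346\t512\t4e-81\t298\n"
)

lookup = build_lookup(json.loads(json_text))
rows = list(csv.reader(io.StringIO(blast_text), delimiter="\t"))
result = join_rows(rows, lookup)
for row in result:
    print("\t".join(row))
```

Building a dictionary keyed on the ID first keeps the join to a single pass over the BLAST rows, instead of rescanning the JSON for every line.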