对多个 CSV 文件做关联
【问题】
I have two CSV files as follows:
CSV1:
**ID Name Address Ph**
1 Mr.C dsf 142
2 Ms.N asd 251
4 Mr.V fgg 014
12 Ms.S trw 547
CSV2:
**ID Name Service Day**
1 Mr.C AAA Mon
2 Ms.N AAA Mon
2 Ms.N BBB Tue
2 Ms.N AAA Sat
As you can see very quickly CSV1 file is unique in having only 1 instance of every ID whilst CSV2 has repeats.
I am trying to match two CSV files based on ID and then wherever they match adding to CSV2 file the Address and Ph fields from CSV1. This is then saved as a new output file leaving the two original CSV files intact.
I have written a code but here’s what’s happening:
-
Either all the entries from CSV1 get added against the last row of CSV2
-
Or all the entries from CSV2 get the same address details appended against them
Here’s what I have done so far.
import csv
csv1=open('C:\csv1file.csv')
csv2=open('C:\csv2file.csv')
csv1reader=csv.reader(csv1)
csv2reader=csv.reader(csv2)
outputfile=open('C:\mapped.csv', 'wb')
csvwriter=csv.writer(outputfile)
counter=0
header1=csv1reader.next()
header2=csv2reader.next()
csvwriter.writerow(header2+header1[2:4])
for row1 in csv1reader:
for row2 in csv2reader:
if row1[0]==row2[0]:
counter=counter+1
csvwriter.writerow(row2+row1[2:4])
I am running this code in Python 2.7. As you might have guessed the two different results that I am getting are based on the indentation of the csvwriter statement in the above code. I feel I am quite close to the answer and understand the logic but somehow the loop doesn’t loop very well.
Can any one of you please assist?
Thanks.
别人给出的解法,有 bug 未解决
csvwriter.writerow(header2+header1[2:4])
csv2copy=[]
for row2 in csv2reader: csv2copy.append(row2)
for row1 in csv1reader:
for row2 in csv2copy:
print row1,row2,counter
if row1[0]==row2[0]:
counter=counter+1
csvwriter.writerow(row2+row1[2:4])
【回答】
这是典型的左连接,python 要循环实现,比较麻烦。可以用 SPL 实现,代码简单易懂:
A |
|
1 |
=file("D:\\csv1.csv").import@t() |
2 |
=file("D:\\csv2.csv").import@t() |
3 |
>A2.switch(ID,A1:ID) |
4 |
=A2.new(ID.ID,Name,Service,Day,ID.Address,ID.Ph) |
5 |
=file("D:\\result.txt").export@t(A4) |
A3:将 A2 中的 ID 切换为指引字段
A4:按关联记录获取最终结果