Reading and Splitting Text

[Question]

I have data in a csv file which looks like this:

fromaddress, toaddress, timestamp
sender1@email.com, recipient1@email.com, recipient2@email.com, 8-1-2015
sender2@email.com, recipient1@email.com, 8-2-2015
sender3@email.com, recipient1@email.com, recipient2@email.com, recipient3@email.com, recipient4@email.com, 8-3-2015
sender1@email.com, recipient1@email.com, recipient2@email.com, recipient3@email.com, 8-4-2015

Using Python, I would like to produce a txt file that looks like:

sender1_email.com, recipient1_email.com
sender1_email.com, recipient2_email.com
sender2_email.com, recipient1_email.com
sender3_email.com, recipient1_email.com
sender3_email.com, recipient2_email.com
sender3_email.com, recipient3_email.com
sender3_email.com, recipient4_email.com
sender1_email.com, recipient1_email.com
sender1_email.com, recipient2_email.com
sender1_email.com, recipient3_email.com

Ultimately, I imagine this whole process will take several steps. After reading in the csv file, I will need to create separate lists for fromaddress and toaddress (I am ignoring the timestamp column altogether). There is only 1 email address per row in the fromaddress column; however, there can be any number of email addresses per row in the toaddress column. I need to duplicate the fromaddress email address for each toaddress email address listed in each row. Once this is done, I need to replace all of the @ symbols with underscore (_) symbols. Finally, when I write the txt file, I need to add an extra blank line between rows so that it is "double-spaced".

I have not gotten very far as I'm a Python newbie and I'm stuck on the first step. The following code duplicates the fromaddress for each individual character in the toaddress column instead of for each individual email address. I also need help with the toaddress list. Can anyone help?

import csv
fromaddress = []
toaddress = []

with open("filename.csv", 'r') as f:
    c = csv.reader(f, delimiter = ",")
    for row in c:
        for item in row[1]:
            fromaddress.append(row[0]);

print(fromaddress)

Everyone, thanks for all of your help! I tried all your code but unfortunately I'm not getting the output I need. Instead of getting this (what I want):

sender1_email.com, recipient1_email.com
sender1_email.com, recipient2_email.com
sender1_email.com, recipient3_email.com
sender2_email.com, recipient1_email.com
sender3_email.com, recipient1_email.com
sender3_email.com, recipient2_email.com

I'm getting this:

sender1_email.com,"recipient1_email.com, recipient2_email.com, recipient3_email.com"
sender2_email.com,"recipient1_email.com"
sender3_email.com,"recipient1_email.com, recipient2_email.com"

There is only 1 element in each "fromaddress" row, but there are multiple elements in each "toaddress" row. Basically, I have to pair each recipient address with the correct sender address. I think I'm not getting the right output because of the double quotation marks (") that surround all of the recipient addresses in each row of the csv file.

[Answer]

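Take the data from row 2 through row N. For each row, use the 1st member as column 1 and the 2nd through second-to-last members as column 2, expanding them into a multi-row two-dimensional table, and replace "@" with "_" in every string.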
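As a rough Python counterpart to the steps just described, the sketch below works on lines already read into memory. The function name `expand_pairs` and the sample data are illustrative only. It assumes that no address contains a comma and that the last field of every data line is the timestamp.

```python
def expand_pairs(lines):
    """Turn 'sender, r1, r2, ..., timestamp' lines into (sender, recipient) pairs."""
    pairs = []
    for line in lines[1:]:  # skip the header line
        # Split on commas, trim the padding spaces, and swap "@" for "_".
        fields = [f.strip().replace("@", "_") for f in line.split(",")]
        if len(fields) < 3:
            continue  # blank or malformed line: needs sender, recipient, timestamp
        sender, recipients = fields[0], fields[1:-1]  # drop the trailing timestamp
        pairs.extend((sender, r) for r in recipients)
    return pairs


sample = [
    "fromaddress, toaddress, timestamp",
    "sender1@email.com, recipient1@email.com, recipient2@email.com, 8-1-2015",
    "sender2@email.com, recipient1@email.com, 8-2-2015",
]
for sender, recipient in expand_pairs(sample):
    print(f"{sender}, {recipient}")
```

Splitting each line by hand rather than indexing `row[1]` is what avoids the per-character duplication seen in the question's code: iterating over a string yields characters, while iterating over the sliced list yields whole addresses.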

This involves a fair number of set operations, which makes it somewhat cumbersome to implement in Python; it is simpler in SPL:


     A
1    =file("d:\\input.csv").read@n().(replace(~,"@","_"))
2    =A1.to(2,).(~.array())
3    =A2.news(~.to(2,~.len()-1);A2.~(1),~)
4    =file("d:\\result.txt").export@c(A3)

A1: Reads the contents of the csv file, making each line a string member of a sequence, and replaces "@" with "_" in each string.

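A2: Takes the 2nd through last members of sequence A1 to form a new sequence, then splits each member into a sequence of its own, returning a sequence of sequences. This step extracts and prepares the data rows.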

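A3: For each member of A2, takes the 2nd through second-to-last members and expands them into two columns (the row's first member and each recipient), returning the result as a new table sequence.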

A4: Exports the result of A3 to a comma-separated text file.
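The follow-up in the question suggests the recipients may actually sit inside one quoted field. A minimal Python sketch for that case lets the csv module unwrap the quotes and then splits the recipient field by hand. It assumes rows shaped like sender,"r1, r2",timestamp, which is a guess because the real file is not shown. The name `write_pairs` is hypothetical, and the extra newline provides the double spacing the question asks for.

```python
import csv
import io


def write_pairs(src, dst):
    """Read rows like sender,"r1, r2",timestamp and write one pair per line."""
    # skipinitialspace lets a quoted field start after ", " as well as ","
    reader = csv.reader(src, skipinitialspace=True)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # blank or malformed row
        sender = row[0].strip().replace("@", "_")
        # row[1] holds the whole quoted recipient list; a timestamp in row[2] is ignored
        for recipient in row[1].split(","):
            recipient = recipient.strip().replace("@", "_")
            if recipient:
                dst.write(f"{sender}, {recipient}\n\n")  # extra newline = double spacing


# In-memory stand-ins for the files; with real files, pass
# open("filename.csv", newline="") and open("result.txt", "w") instead.
src = io.StringIO(
    'fromaddress,toaddress,timestamp\n'
    'sender1@email.com,"recipient1@email.com, recipient2@email.com",8-1-2015\n'
    'sender2@email.com,"recipient1@email.com",8-2-2015\n'
)
dst = io.StringIO()
write_pairs(src, dst)
print(dst.getvalue())
```

If the file instead uses the unquoted layout shown at the top of the question, `row[1]` would hold only the first recipient, so the recipients would have to be taken from `row[1:-1]`.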