如何对两个 Excel 表实现各种类型的 JOIN
某Excel中的sheet分为3类,其中sheet A是基础表,部分数据如下:
A | B | C | |
1 | interval1 | interval2 | interval3 |
2 | 1 hour | 1 day | 1 week |
3 | 2 hours | 2 days | 2 weeks |
4 | 3 hours | 3 days | 3 weeks |
5 | 4 hours | 4 days | 4 weeks |
Sheet B1\B2…Bn是关联表,它们的格式都一样,且与A有相同的列interval1、interval2、interval3。其中一个B的部分数据如下:
A | B | C | D | E | F | G | |
1 | interval1 | interval2 | interval3 | Type | value1 | value2 | value3 |
2 | 2 hours | 1 day | 7 week | circle | 37 | 108.1 | 4.1 |
3 | 3 hours | 3 days | 7 weeks | Line | 39 | 117.5 | 4.2 |
4 | 4 hours | 4 days | 7 weeks | Line | 35 | 127 | 4.3 |
Sheet C用来描述A与B1\B2..Bn的Join类型,共3种,其中cross Join表示笛卡尔积;leftJoinBig表示左关联,关联列是interval1;leftJoinSmall也是左关联,关联列是interval1、interval2。部分数据如下:
A | B | |
1 | table | joinType |
2 | B1 | crossJoin |
3 | B2 | leftJoinBig |
4 | B3 | leftJoinSmall |
计算目标:按照sheet C中的join类型将sheet A和B1\B2..Bn关联起来,从A中取interva1列,从B取其他列,最后形成n个Excel文件。
以上面的sheet B为例(实际上每个B应当不同),如果joinType==crossJoin,则关联结果应该是:
A | B | C | D | E | F | G | |
1 | interval1 | interval2 | interval3 | Type | value1 | value2 | value3 |
2 | 1 hour | 1 day | 7 week | Circle | 37 | 108.1 | 4.1 |
3 | 1 hour | 3 days | 7 weeks | Line | 39 | 117.5 | 4.2 |
4 | 1 hour | 4 days | 7 weeks | Line | 35 | 127 | 4.3 |
5 | 2 hours | 1 day | 7 week | circle | 37 | 108.1 | 4.1 |
6 | 2 hours | 3 days | 7 weeks | Line | 39 | 117.5 | 4.2 |
7 | 2 hours | 4 days | 7 weeks | Line | 35 | 127 | 4.3 |
8 | 3 hours | 1 day | 7 week | circle | 37 | 108.1 | 4.1 |
9 | 3 hours | 3 days | 7 weeks | Line | 39 | 117.5 | 4.2 |
10 | 3 hours | 4 days | 7 weeks | Line | 35 | 127 | 4.3 |
11 | 4 hours | 1 day | 7 week | circle | 37 | 108.1 | 4.1 |
12 | 4 hours | 3 days | 7 weeks | Line | 39 | 117.5 | 4.2 |
13 | 4 hours | 4 days | 7 weeks | Line | 35 | 127 | 4.3 |
如果joinType==leftJoinBig,则关联结果应该是:
A | B | C | D | E | F | G | |
1 | interval1 | interval2 | interval3 | Type | value1 | value2 | value3 |
2 | 1 hour | ||||||
3 | 2 hours | 1 day | 7 week | circle | 37 | 108.1 | 4.1 |
4 | 3 hours | 3 days | 7 weeks | line | 39 | 117.5 | 4.2 |
5 | 4 hours | 4 days | 7 weeks | line | 35 | 127 | 4.3 |
如果joinType==leftJoinSmall,则关联结果应该是:
A | B | C | D | E | F | G | |
1 | interval1 | interval2 | interval3 | type | value1 | value2 | value3 |
2 | 1 hour | ||||||
3 | 2 hours | ||||||
4 | 3 hours | 3 days | 7 weeks | line | 39 | 117.5 | 4.2 |
5 | 4 hours | 4 days | 7 weeks | line | 35 | 127 | 4.3 |
这个计算需要循环遍历sheet C,因此只能用脚本而不是公式来实现。Join属于结构化计算,VBA缺乏直接可用的函数,代码会非常繁琐。
实现步骤:
1. 运行集算器(可以到润乾官网下载,用职场版,首次运行时会提示加载授权,下载个免费的就够了)
2. 编写脚本并执行
A | B | |
1 | =file("data.xlsx").xlsopen() | |
2 | =A1.xlsimport@t(;"C") | |
3 | =tableA=A1.xlsimport@t(;"A") | |
4 | for A2 | =tableB=A1.xlsimport@t(;A4.table) |
5 | =case(A4.joinType, "crossJoin",xjoin(tableA:A;tableB:B), "leftJoinBig",xjoin@1 (tableA:A;tableB:B,A.interval1==interval1), "leftJoinSmall",xjoin@1(tableA:A;tableB:B,A.interval1==interval1 && A.interval2==interval2)) |
|
6 | =B5.new(A.interval1,B.interval2,B.interval3,B.type,B.value1,B.value2,B.value3) | |
7 | =file(A4.table+A4.joinType+".xlsx").xlsexport@t(B6) |
脚本函数 case 可对 Join 类型做分支判断,xjoin 算出笛卡尔积,@1 表示左关联。