Check if values are the same for each row over criteria in different other columns
问题
CREATETABLEcampaigns
(
idSERIALPRIMARYKEY,
campaignVARCHAR,
supplierVARCHAR,
plan_quantityDECIMAL
);
INSERTINTOcampaigns(campaign,supplier,plan_quantity)
VALUES('C001','supplier_a','500'),
('C001','supplier_a','500'),
('C001','supplier_b','500'),
('C002','supplier_a','600'),
('C002','supplier_b','700'),
('C003','supplier_c','100'),
('C003','supplier_c','100'),
('C004','supplier_a','900'),
('C004','supplier_c','800'),
('C004','supplier_d','250'),
('C004','supplier_d','250'),
('C005','supplier_b','380'),
('C005','supplier_b','270'),
('C005','supplier_d','590');
Expected result:
campaign|supplier|plan_quantity|check|
----------|--------------|-------------------|--------------------|-------
C001|supplier_a|500|same|
C001|supplier_a|500|same|
C001|supplier_b|500|non-relevant|
----------|--------------|-------------------|--------------------|-------
C002|supplier_a|600|non-relevant|
C002|supplier_b|700|non-relevant|
----------|--------------|-------------------|--------------------|-------
C003|supplier_c|100|same|
C003|supplier_c|100|same|
----------|--------------|-------------------|--------------------|-------
C004|supplier_a|900|non-relevant|
C004|supplier_c|800|non-relevant|
C004|supplier_d|250|same|
C004|supplier_d|250|same|
----------|--------------|-------------------|--------------------|-------
C005|supplier_b|380|different|
C005|supplier_b|270|different|
C005|supplier_d|590|non-relevant|
In case a supplier appears multiple times per campaign, I want to see in columncheck
if theplan_quantity
for this supplier is the same in every row.
I am getting close to the result when I modify the query fromthis question:
SELECT
campaignAScampaign,
supplierASsupplier,
plan_quantityASplan_quantity,
(CASE
WHENMIN(plan_quantity)OVER(PARTITIONBYsupplier,campaign)=MAX(plan_quantity)OVER(PARTITIONBYsupplier,campaign)
THEN'same'
ELSE'different'
END)AScheck
FROM
campaigns
ORDERBY
1,2,3;
However, I have no clue how I can add the description non-relevant to the query in case a supplier does not appear multiple times per campaign.
Do you have any idea?
解答
用自然思维去想这个问题,只需要对2、3、4列有序的数据添加值为non- relevant的check列,先按2、3列有序分组,如果分组后的组内记录数大于1,则check的值改为different,再对组内的结果再按第4列分组,如果分组后的组内记录数大于1,则check的值改为same。这类运算用SQL写起来很麻烦,要多层case语句配合多个窗口函数,写出来的SQL很不直观。通常的办法是读出来用Python或SPL来做, SPL(一种 Java 的开源包)更容易被Java应用集成,代码也更简单一点,只要两句:
A |
|
1 |
=PG.query("SELECT *,'non-relevant' as check FROM CAMPAIGNS ORDER BY 2,3,4") |
2 |
>A1.group@o(#2,#3).(if(~.len()>1,~.run(check="different"),~).group@o(#4).(if(~.len()>1,~.run(check="same")))) |
English version