PROC SQL - How group duplicate rows

Reply
Contributor
Posts: 34

PROC SQL - How group duplicate rows

[ Edited ]
 

Hi,

Would anyone know how to, by using PROC SQL merge duplicate variables rows and this would result in a sum up of one column?

tried SELECT DISTINCT and GROUP BY but I am still receiving multiple duplicates. 

 

Get duplicate rows showing for each Product #... don't want to use  PROC SORT nodupkey -> it will remove all but 1 and the amount for Jan will not be correct... 

 

I want the Product #  and code to be distinct  - if there are amounts in both months then it should be one line.

 

 

please HELP  - see code below

 

proc sql; 
create table TRANSPOSE_VIEW as 

select distinct
Product,    
ADJ_REASON_CODE,    
CODE,    
ADJ_ACTIVITY1_CATG,   
gl_DESC1,   
ADJ_ACTIVITY2_CATG,
case repMTD_CD when 'CM' then repMTH end as CM,
case repMTD_CD when 'CM' then SAP_AMOUNT end as JAN_AMOUNT,
case repMTD_CD when 'PM' then repMTH end as PM,
case repMTD_CD when 'PM' then SAP_AMOUNT end as FRB_AMOUNT
 

from CATSegmentRepAll

group by
Product, 
ADJ_REASON_CODE, 
CODE,
ADJ_ACTIVITY1_CATG,
ADJ_ACTIVITY2_CATG
;QUIT;

this is what I am getting 

Product #ADJ_REASON_CODECODEADJ_ACTIVITY1_CATGgl_DESC1ADJ_ACTIVITY2_CATGCMAPRIL_AMOUNTPMMARCH_AMOUNT
50000683DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50000683DISCOASBACT60PricingContraActivation/Port-in Credits2018-0460  
50001288DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50002333DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50002333DISCOASBACT60PricingContraActivation/Port-in Credits2018-0460  
50003357DISCOASBACT50PricingContraActivation/Port-in Credits  2018-0350
50004676DISCOASBACT70PricingContraActivation/Port-in Credits  2018-0370
50004676DISCOASBACT70PricingContraActivation/Port-in Credits2018-0470  
50005846DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50005846DISCOASBACT60PricingContraActivation/Port-in Credits2018-0460  
50006022DISCOASBACT50PricingContraActivation/Port-in Credits  2018-0350
50006022DISCOASBACT50PricingContraActivation/Port-in Credits2018-0450  
50006756DISCOASBACT10PricingContraActivation/Port-in Credits  2018-0310
50006756DISCOASBACT10PricingContraActivation/Port-in Credits2018-0410  

 

 

but what I want is 

 

Product #ADJ_REASON_CODECODEADJ_ACTIVITY1_CATGgl_DESC1ADJ_ACTIVITY2_CATGCMAPRIL_AMOUNTPMMARCH_AMOUNT
50000683DISCOASBACT60PricingContraActivation/Port-in Credits2018-04602018-0360
50001288DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50002333DISCOASBACT60PricingContraActivation/Port-in Credits2018-04602018-0360
50003357DISCOASBACT50PricingContraActivation/Port-in Credits  2018-0350
50004676DISCOASBACT70PricingContraActivation/Port-in Credits2018-04702018-0370
50005846DISCOASBACT60PricingContraActivation/Port-in Credits2018-04602018-0360
50006022DISCOASBACT50PricingContraActivation/Port-in Credits2018-04502018-0350
50006756DISCOASBACT10PricingContraActivation/Port-in Credits2018-04102018-0310
Esteemed Advisor
Posts: 5,532

Re: PROC SQL - How group duplicate rows

Make sure that all your select list columns, except the aggregated columns, are mentioned in the GROUP BY list. Use the CALCULATED keyword to refer to new columns. Follow this structure:

 

proc sql;
create table want as
select 
    x, 
    y, 
    z, 
    r+u as newVar,
    sum(t) as sumVar
from have
group by x, y, z, calculated newVar;
quit;
PG
Contributor
Posts: 34

Re: PROC SQL - How group duplicate rows

 

Hi @PGStats, I updated the 'group by" in the code - see below. still getting the same duplicates. 

 

Product #ADJ_REASON_CODECODEADJ_ACTIVITY1_CATGgl_DESC1ADJ_ACTIVITY2_CATGCMAPRIL_AMOUNTPMMARCH_AMOUNT
50000683DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50000683DISCOASBACT60PricingContraActivation/Port-in Credits2018-0460  
50001288DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50002333DISCOASBACT60PricingContraActivation/Port-in Credits  2018-0360
50002333DISCOASBACT60PricingContraActivation/Port-in Credits2018-0460  

 

proc sql; create table SegmentALL as 
select distinct
SUBSCRIBER_NO,    
ADJ_REASON_CODE,    
DISCOUNT_CODE,    
ADJ_ACTIVITY1_CATG,   
gl_DESC1,  
segment,   
ADJ_ACTIVITY2_CATG,
case repMTD_CD when 'CM' then repMTH end as CM,
case repMTD_CD when 'CM' then SAP_AMOUNT end as APRIL_AMOUNT,
case repMTD_CD when 'CM' then Instances end as APRIL_Transactions,
case repMTD_CD when 'PM' then repMTH end as PM,
case repMTD_CD when 'PM' then SAP_AMOUNT end as MARCH_AMOUNT,
case repMTD_CD when 'PM' then Instances end as MARCH_Transactions 

from CATSegmentRepAll


group by
SUBSCRIBER_NO,    
ADJ_REASON_CODE,    
DISCOUNT_CODE,    
ADJ_ACTIVITY1_CATG,   
gl_DESC1,  
segment,   
ADJ_ACTIVITY2_CATG
;QUIT;

 

Ask a Question
Discussion stats
  • 2 replies
  • 118 views
  • 2 likes
  • 2 in conversation