Hi,
Would anyone know how to, by using PROC SQL merge duplicate variables rows and this would result in a sum up of one column?
tried SELECT DISTINCT and GROUP BY but I am still receiving multiple duplicates.
Get duplicate rows showing for each Product #... don't want to use PROC SORT nodupkey -> it will remove all but 1 and the amount for Jan will not be correct...
I want the Product # and code to be distinct - if there are amounts in both months then it should be one line.
please HELP - see code below
proc sql;
create table TRANSPOSE_VIEW as
select distinct
Product,
ADJ_REASON_CODE,
CODE,
ADJ_ACTIVITY1_CATG,
gl_DESC1,
ADJ_ACTIVITY2_CATG,
case repMTD_CD when 'CM' then repMTH end as CM,
case repMTD_CD when 'CM' then SAP_AMOUNT end as JAN_AMOUNT,
case repMTD_CD when 'PM' then repMTH end as PM,
case repMTD_CD when 'PM' then SAP_AMOUNT end as FRB_AMOUNT
from CATSegmentRepAll
group by
Product,
ADJ_REASON_CODE,
CODE,
ADJ_ACTIVITY1_CATG,
ADJ_ACTIVITY2_CATG
;QUIT;
this is what I am getting
Product # | ADJ_REASON_CODE | CODE | ADJ_ACTIVITY1_CATG | gl_DESC1 | ADJ_ACTIVITY2_CATG | CM | APRIL_AMOUNT | PM | MARCH_AMOUNT |
50000683 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 60 | ||
50000683 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 60 | ||
50001288 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 60 | ||
50002333 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 60 | ||
50002333 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 60 | ||
50003357 | DISCOA | SBACT50 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 50 | ||
50004676 | DISCOA | SBACT70 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 70 | ||
50004676 | DISCOA | SBACT70 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 70 | ||
50005846 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 60 | ||
50005846 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 60 | ||
50006022 | DISCOA | SBACT50 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 50 | ||
50006022 | DISCOA | SBACT50 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 50 | ||
50006756 | DISCOA | SBACT10 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 10 | ||
50006756 | DISCOA | SBACT10 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 10 |
but what I want is
Product # | ADJ_REASON_CODE | CODE | ADJ_ACTIVITY1_CATG | gl_DESC1 | ADJ_ACTIVITY2_CATG | CM | APRIL_AMOUNT | PM | MARCH_AMOUNT |
50000683 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 60 | 2018-03 | 60 |
50001288 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 60 | ||
50002333 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 60 | 2018-03 | 60 |
50003357 | DISCOA | SBACT50 | Pricing | Contra | Activation/Port-in Credits | 2018-03 | 50 | ||
50004676 | DISCOA | SBACT70 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 70 | 2018-03 | 70 |
50005846 | DISCOA | SBACT60 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 60 | 2018-03 | 60 |
50006022 | DISCOA | SBACT50 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 50 | 2018-03 | 50 |
50006756 | DISCOA | SBACT10 | Pricing | Contra | Activation/Port-in Credits | 2018-04 | 10 | 2018-03 | 10 |
Make sure that all your select list columns, except the aggregated columns, are mentioned in the GROUP BY list. Use the CALCULATED keyword to refer to new columns. Follow this structure:
proc sql;
create table want as
select
x,
y,
z,
r+u as newVar,
sum(t) as sumVar
from have
group by x, y, z, calculated newVar;
quit;
Make sure that all your select list columns, except the aggregated columns, are mentioned in the GROUP BY list. Use the CALCULATED keyword to refer to new columns. Follow this structure:
proc sql;
create table want as
select
x,
y,
z,
r+u as newVar,
sum(t) as sumVar
from have
group by x, y, z, calculated newVar;
quit;
Hi @PGStats, I updated the 'group by" in the code - see below. still getting the same duplicates.
Product # ADJ_REASON_CODE CODE ADJ_ACTIVITY1_CATG gl_DESC1 ADJ_ACTIVITY2_CATG CM APRIL_AMOUNT PM MARCH_AMOUNT 50000683 DISCOA SBACT60 Pricing Contra Activation/Port-in Credits 2018-03 60 50000683 DISCOA SBACT60 Pricing Contra Activation/Port-in Credits 2018-04 60 50001288 DISCOA SBACT60 Pricing Contra Activation/Port-in Credits 2018-03 60 50002333 DISCOA SBACT60 Pricing Contra Activation/Port-in Credits 2018-03 60 50002333 DISCOA SBACT60 Pricing Contra Activation/Port-in Credits 2018-04 60
proc sql; create table SegmentALL as
select distinct
SUBSCRIBER_NO,
ADJ_REASON_CODE,
DISCOUNT_CODE,
ADJ_ACTIVITY1_CATG,
gl_DESC1,
segment,
ADJ_ACTIVITY2_CATG,
case repMTD_CD when 'CM' then repMTH end as CM,
case repMTD_CD when 'CM' then SAP_AMOUNT end as APRIL_AMOUNT,
case repMTD_CD when 'CM' then Instances end as APRIL_Transactions,
case repMTD_CD when 'PM' then repMTH end as PM,
case repMTD_CD when 'PM' then SAP_AMOUNT end as MARCH_AMOUNT,
case repMTD_CD when 'PM' then Instances end as MARCH_Transactions
from CATSegmentRepAll
group by
SUBSCRIBER_NO,
ADJ_REASON_CODE,
DISCOUNT_CODE,
ADJ_ACTIVITY1_CATG,
gl_DESC1,
segment,
ADJ_ACTIVITY2_CATG
;QUIT;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.