Hi,
Can someone help me with the issue below?
When I use SUM and GROUP BY in PROC SQL, with or without DISTINCT, I am not getting unique records.
As you can see from the output image below, I should get only 3 rows in my output. How can I get rid of the 2nd or 3rd row?
data have;
/* List input: the values are blank-delimited, so informats replace column positions */
input code :$3. tcode :$4. order_date :date9. invoice_date :date9.
invoice_code :$4. customer :$5. Cust_code :$4. P_code :$5.
P_name :$9. H_code :$4. T_code :$6. Plant :$4. Cat_code :$3.
Cat_desc :$1. Item_code :$3. Item_desc :$10. Distance Qty HPrice Rev;
format order_date invoice_date date9.;
datalines;
101 1019 21NOV2023 25NOV2023 2626 B11AC B100 52486 New_B11AC EAST GNA1UC 100A 200 H 380 HPayDry 9.3 0 0.00 0
101 1019 21NOV2023 25NOV2023 2626 B11AC B100 52486 New_B11AC EAST GNA1UC 100A 200 H 370 HChargeDry 9.3 2.14 0.00 94.25
101 1019 21NOV2023 25NOV2023 2626 B11AC B100 52486 New_B11AC EAST GNA1UC 100A 200 H 370 HChargeDry 9.3 2.14 0.00 113.36
101 1019 21NOV2023 25NOV2023 2626 B11AC B100 52486 New_B11AC EAST GNA1UC 100A 114 S 520 0/2_catP3 9.3 2.14 4.35 169.38
;
proc sql;
create table want (drop=Qty Rev) as
select*,
sum(Qty) as P_Qty,
sum(Rev) as Revenue
from have
group by cat_code, Cat_desc, item_code,item_desc
;
quit;
Current output I am getting:
Expected output: when I use SUM and GROUP BY, I should get only 3 rows, as per the requirement. Rows 2 and 3 are the same, so how do I get rid of one of them?
Sum issue: for sample purposes I have manually shown a few records here. When I use the SUM function on a large dataset, I end up getting a sum over the complete data rather than per group. The image below shows the issue with the sums of revenue and quantity when I use SUM and GROUP BY in PROC SQL. In reality it should give the sum based on the grouping only. The sum below is not right; it should not be this high.
Thanks,
vnreddy
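You asked SAS to return ALL of the variables, so that requires ALL of the observations. When the SELECT list contains columns that are neither in the GROUP BY clause nor inside a summary function, PROC SQL remerges the summary statistics back onto every detail row, and the log says so in a NOTE.
If you want just the summary results and the GROUP BY variables, then you need to ask for only those variables.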
proc sql;
create table want as
select cat_code, Cat_desc, item_code, item_desc
, sum(Qty) as P_Qty
, sum(Rev) as Revenue
from have
group by cat_code, Cat_desc, item_code, item_desc
;
quit;
PS Don't hide those continuation commas at the ends of the lines. It is much harder to scan for them there since the right side of the lines is jagged.
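If it helps to catch this kind of accidental remerge early, one option is the NOREMERGE option on the PROC SQL statement. The sketch below reruns the original query from the question against the HAVE dataset. With NOREMERGE, the step should stop with an error instead of quietly returning every detail row.

```sas
/* NOREMERGE: PROC SQL raises an error rather than remerging
   summary statistics back onto the detail rows. */
proc sql noremerge;
create table want as
select *
, sum(Qty) as P_Qty
, sum(Rev) as Revenue
from have
group by cat_code, Cat_desc, item_code, item_desc
;
quit;
```

Without the option, the same query runs and the log only shows a NOTE that the query requires remerging summary statistics back with the original data.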
Then SAS did what you wanted.
Try it with SASHELP.CLASS.
proc sql;
select *,mean(age) as Gender_Mean
from sashelp.class
group by sex
;
quit;
As you can see you get ALL of the observations and the new variable has the same value for all observations that share the same values of the GROUP BY variables.
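For comparison, here is a minimal sketch of the same query with only the GROUP BY variable and the summary function in the SELECT list. It should return one row per value of SEX instead of one row per student.

```sas
/* Only the grouping column and the aggregate are selected,
   so no remerge happens and the result has one row per group. */
proc sql;
select sex
, mean(age) as Gender_Mean
from sashelp.class
group by sex
;
quit;
```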