Hi Experts,
I have a table with columns like
Col1, col2, col3 , col4 , col5 , amount
for each distinct record for the combination of ( Col1, col2, col3 , col4 , col5 ) , I would like to get the count and the total sum ?
Below code is not working.... Can somebody help? Please...
proc sql;
select distinct (Col1, col2, col3 , col4 , col5) , count(*) as number , sum(amount) from tablename group by distinct (Col1, col2, col3 , col4 , col5) ;
quit;
Thanks in advance..
Hi Vijay,
Please find below the code and its output:-
==========================================
data new;
input col1 $ col2 $ col3 $ col4 $ col5 $ amount;
datalines;
a b c d e 23
d e f g h 34
a b g h j 36
a b c d e 45
a b c d e 90
d e f g h 80
a s d f g 70
;
run;
proc sql;
create table summarize as select distinct col1,col2,col3,col4,col5 ,count(*) as Count , sum(amount) as Sum from new group by col1,col2,col3,col4,col5;
quit;
===========================================
Output:-
=====================================
Obs col1 col2 col3 col4 col5 Count Sum
1 a b c d e 3 158
2 a b g h j 1 36
3 a s d f g 1 70
4 d e f g h 2 114
=====================================
I hope this solves your query.
/Daman
I usually have better luck adding a subquery and I am assuming that the dups are existing in the amount column otherwise if its just the other columns the group by, by itself, should return distinct combinations of col1 - col5:
Proc sql;
select col1, col2, col3, col4 , col5, count(*) as number, sum(amount) as amount
from (select distinct col1, col2, col3, col4, col5, amount from tablename)
group by col1, col2, col3, col4, col5
;
quit;
An example of the data may be in order to help explain the issue you are seeing.
EJ
Thanks EJ for your answer. This reply got me to a solution ... !! Thank you!
After running the query I realized that I dont want the Distinct keyword...because
it will remove the duplicates and this will remove the amount also...which is not wanted....like
Sr. No | col1 | col2 | col3 | col4 | col5 | amount |
1 | M | N | O | P | Q | 100 |
2 | M | N | O | P | Q | 100 |
3 | M | N | O | P | Q | 100 |
4 | M | N | O | P | Q | 105 |
5 | X | Y | Z | M | N | 200 |
In above sample dataset example , the inner query -> select distinct col1, col2, col3, col4, col5, amount from tablename
will select only 1 ,4 and 5 row ... as 2 and 3rd are duplicates of 1st
But end result shoudl be like
Sr. No | col1 | col2 | col3 | col4 | col5 | number | amount |
1 | M | N | O | P | Q | 4 | 405 |
2 | X | Y | Z | M | N | 1 | 200 |
So I think the inner query will give wrong results... am I right?
So for my requirement, the following query should be sufficient:
Proc sql;
select col1, col2, col3, col4 , col5, count(*) as number, sum(amount) as amount
from tablename
group by col1, col2, col3, col4, col5
;
quit;
please correct if I am wrong...
I think earlier I didnt frame my question correctly. sorry about that... and thanks for your help!
Hi Vijay,
Please find below the code and its output:-
==========================================
data new;
input col1 $ col2 $ col3 $ col4 $ col5 $ amount;
datalines;
a b c d e 23
d e f g h 34
a b g h j 36
a b c d e 45
a b c d e 90
d e f g h 80
a s d f g 70
;
run;
proc sql;
create table summarize as select distinct col1,col2,col3,col4,col5 ,count(*) as Count , sum(amount) as Sum from new group by col1,col2,col3,col4,col5;
quit;
===========================================
Output:-
=====================================
Obs col1 col2 col3 col4 col5 Count Sum
1 a b c d e 3 158
2 a b g h j 1 36
3 a s d f g 1 70
4 d e f g h 2 114
=====================================
I hope this solves your query.
/Daman
Yes Daman, you are spot on!
Thanks for your reply and answer!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.