Hi, I need a particular kind of summary, I hope that it would be possible to do:
I have this datatable:
ID | MTO | DATE |
1 | 100 | jul-15 |
2 | 120 | jun-15 |
3 | 130 | may-15 |
1 | 90 | may-15 |
2 | 80 | jun-15 |
3 | 100 | may-15 |
1 | 200 | may-15 |
2 | 110 | jul-15 |
3 | 100 | jul-15 |
I need to get kind of summary, but using data step cause I have too many obs:
ID | Sum(MTO) may-15 | Sum(MTO) jun-15 | Sum(MTO) jul-15 |
1 | 330 | 80 | 200 |
2 | 190 | 120 | 110 |
I was using proc summary, but I could only get a sum without the group by date.
Hi Jerry, thanks for your answer, but there is not totally right:
proc summary output:
id | date | sum_mto |
0 | OCT14 | 0.2611745 |
0 | NOV14 | 0.01 |
0 | DEC14 | 0.08014064 |
0 | FEB15 | 0.47 |
0 | MAR15 | 0.12968991 |
0 | APR15 | 0.37 |
0 | MAY15 | 0.02316756 |
0 | JUN15 | 0.05 |
0 | JUL15 | 0.05017248 |
0 | AUG15 | 0.26142127 |
0 | SEP15 | 0.01620732 |
2 | APR15 | 15.044794 |
2 | JUN15 | 0.0317058 |
83 | DEC14 | 300001.67 |
262 | SEP15 | 31.0078 |
310 | NOV14 | 3257 |
361 | JUN15 | 15000 |
382 | OCT14 | 20.5578741 |
526 | MAR15 | 0.8397922 |
583 | DEC14 | 0.00669792 |
584 | SEP15 | 2.96077683 |
603 | APR15 | 35.11013 |
603 | JUN15 | 69.369564 |
609 | OCT14 | 68.6342 |
693 | JAN15 | 5.609508 |
693 | JUN15 | 25.26848 |
710 | JUN15 | 1734.04944 |
718 | JUL15 | 0.00313578 |
727 | MAY15 | 0.00316756 |
742 | JAN15 | 23.25582 |
793 | MAY15 | 0.00633512 |
881 | SEP15 | 80.5151477 |
921 | AUG15 | 12.31 |
935 | MAY15 | 63.7552 |
948 | DEC14 | 353.77335 |
948 | MAR15 | 18.398973 |
I added var to your code, is it right? BUt, here is the problem, the output is strange:
proc transpose data=summ out=want(drop=_: ) prefix=mto_ ;
by id;
var id date;
run;
The output of this code:
id | mto_1 | mto_2 | mto_3 | mto_4 | mto_5 | mto_6 | mto_7 | mto_8 | mto_9 | mto_10 | mto_11 | mto_12 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | . |
0 | 19997 | 20028 | 20058 | 20121 | 20148 | 20179 | 20209 | 20240 | 20270 | 20301 | 20332 | . |
2 | 2 | 2 | . | . | . | . | . | . | . | . | . | . |
2 | 20179 | 20240 | . | . | . | . | . | . | . | . | . | . |
83 | 83 | . | . | . | . | . | . | . | . | . | . | . |
83 | 20058 | . | . | . | . | . | . | . | . | . | . | . |
262 | 262 | . | . | . | . | . | . | . | . | . | . | . |
262 | 20332 | . | . | . | . | . | . | . | . | . | . | . |
310 | 310 | . | . | . | . | . | . | . | . | . | . | . |
310 | 20028 | . | . | . | . | . | . | . | . | . | . | . |
361 | 361 | . | . | . | . | . | . | . | . | . | . | . |
361 | 20240 | . | . | . | . | . | . | . | . | . | . | . |
382 | 382 | . | . | . | . | . | . | . | . | . | . | . |
382 | 19997 | . | . | . | . | . | . | . | . | . | . | . |
526 | 526 | . | . | . | . | . | . | . | . | . | . | . |
526 | 20148 | . | . | . | . | . | . | . | . | . | . | . |
583 | 583 | . | . | . | . | . | . | . | . | . | . | . |
583 | 20058 | . | . | . | . | . | . | . | . | . | . | . |
584 | 584 | . | . | . | . | . | . | . | . | . | . | . |
584 | 20332 | . | . | . | . | . | . | . | . | . | . | . |
The date was transformed from a ddmmyy format, could it be the problem? And, you know why there are replicated ids? Thanks!
You shouldn't need a Data step for this.
Try something like this...
proc summary data=have nway;
class id date;
var mto;
output out=summ(drop=_: ) sum=;
run;
/* Then to have the month sums as columns... */
proc transpose data=summ out=want(drop=_: ) prefix=mto_ ;
by id;
id date;
run;
Hope this helps.
Hi Jerry, thanks for your answer, but there is not totally right:
proc summary output:
id | date | sum_mto |
0 | OCT14 | 0.2611745 |
0 | NOV14 | 0.01 |
0 | DEC14 | 0.08014064 |
0 | FEB15 | 0.47 |
0 | MAR15 | 0.12968991 |
0 | APR15 | 0.37 |
0 | MAY15 | 0.02316756 |
0 | JUN15 | 0.05 |
0 | JUL15 | 0.05017248 |
0 | AUG15 | 0.26142127 |
0 | SEP15 | 0.01620732 |
2 | APR15 | 15.044794 |
2 | JUN15 | 0.0317058 |
83 | DEC14 | 300001.67 |
262 | SEP15 | 31.0078 |
310 | NOV14 | 3257 |
361 | JUN15 | 15000 |
382 | OCT14 | 20.5578741 |
526 | MAR15 | 0.8397922 |
583 | DEC14 | 0.00669792 |
584 | SEP15 | 2.96077683 |
603 | APR15 | 35.11013 |
603 | JUN15 | 69.369564 |
609 | OCT14 | 68.6342 |
693 | JAN15 | 5.609508 |
693 | JUN15 | 25.26848 |
710 | JUN15 | 1734.04944 |
718 | JUL15 | 0.00313578 |
727 | MAY15 | 0.00316756 |
742 | JAN15 | 23.25582 |
793 | MAY15 | 0.00633512 |
881 | SEP15 | 80.5151477 |
921 | AUG15 | 12.31 |
935 | MAY15 | 63.7552 |
948 | DEC14 | 353.77335 |
948 | MAR15 | 18.398973 |
I added var to your code, is it right? BUt, here is the problem, the output is strange:
proc transpose data=summ out=want(drop=_: ) prefix=mto_ ;
by id;
var id date;
run;
The output of this code:
id | mto_1 | mto_2 | mto_3 | mto_4 | mto_5 | mto_6 | mto_7 | mto_8 | mto_9 | mto_10 | mto_11 | mto_12 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | . |
0 | 19997 | 20028 | 20058 | 20121 | 20148 | 20179 | 20209 | 20240 | 20270 | 20301 | 20332 | . |
2 | 2 | 2 | . | . | . | . | . | . | . | . | . | . |
2 | 20179 | 20240 | . | . | . | . | . | . | . | . | . | . |
83 | 83 | . | . | . | . | . | . | . | . | . | . | . |
83 | 20058 | . | . | . | . | . | . | . | . | . | . | . |
262 | 262 | . | . | . | . | . | . | . | . | . | . | . |
262 | 20332 | . | . | . | . | . | . | . | . | . | . | . |
310 | 310 | . | . | . | . | . | . | . | . | . | . | . |
310 | 20028 | . | . | . | . | . | . | . | . | . | . | . |
361 | 361 | . | . | . | . | . | . | . | . | . | . | . |
361 | 20240 | . | . | . | . | . | . | . | . | . | . | . |
382 | 382 | . | . | . | . | . | . | . | . | . | . | . |
382 | 19997 | . | . | . | . | . | . | . | . | . | . | . |
526 | 526 | . | . | . | . | . | . | . | . | . | . | . |
526 | 20148 | . | . | . | . | . | . | . | . | . | . | . |
583 | 583 | . | . | . | . | . | . | . | . | . | . | . |
583 | 20058 | . | . | . | . | . | . | . | . | . | . | . |
584 | 584 | . | . | . | . | . | . | . | . | . | . | . |
584 | 20332 | . | . | . | . | . | . | . | . | . | . | . |
The date was transformed from a ddmmyy format, could it be the problem? And, you know why there are replicated ids? Thanks!
Not sure why you needed to add the VAR statement. Its implied, but to be explicit, you'd write as:
proc summary data=have nway;
class id date;
var mto;
output out=summ(drop=_: ) sum=;
run;
/* Then to have the month sums as columns... */
proc transpose data=summ out=want(drop=_: ) prefix=mto_ ;
by id;
id date;
var mto;
run;
And this is the result (for me):
ID | mto_MAY15 | mto_JUL15 | mto_JUN15 |
1 | 290 | 100 | . |
2 | . | 110 | 200 |
3 | 230 | 100 | . |
One simple way, assuming your dates are SAS dates.
data have;
input ID MTO DATE :monyy.;
format date monyy.;
datalines;
1 100 jul-15
2 120 jun-15
3 130 may-15
1 90 may-15
2 80 jun-15
3 100 may-15
1 200 may-15
2 110 jul-15
3 100 jul-15
;
proc sql;
create table sumList as
select id,
put(date,monyy7.) as date,
cats("Sum(MTO)", put(date,monyy7.)) as label, sum(mto) as sumMTO
from have
group by id, calculated date, label;
quit;
proc transpose data=sumList out=sumTable(drop=_name_);
by id;
var sumMTO;
id date;
idlabel label;
run;
proc print data=sumTable noobs label; run;
Yes, SQL's flexibility comes at a price. If you can get what you want with proc summary then it is probably faster. I wouldn't expect a huge difference on such a simple task however.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.