Solved
Contributor
Posts: 69

# Summary by variable

Hi, I need a particular kind of summary, I hope that it would be possible to do:

I have this datatable:

 ID MTO DATE 1 100 jul-15 2 120 jun-15 3 130 may-15 1 90 may-15 2 80 jun-15 3 100 may-15 1 200 may-15 2 110 jul-15 3 100 jul-15

I need to get kind of summary, but using data step cause I have too many obs:

 ID Sum(MTO) may-15 Sum(MTO) jun-15 Sum(MTO) jul-15 1 330 80 200 2 190 120 110

I was using proc summary, but I could only get a sum without the group by date.

Accepted Solutions
Solution
‎10-22-2015 12:29 PM
Contributor
Posts: 69

## Re: Summary by variable

[ Edited ]

Hi Jerry, thanks for your answer, but there is not totally right:

proc summary output:

 id date sum_mto 0 OCT14 0.2611745 0 NOV14 0.01 0 DEC14 0.08014064 0 FEB15 0.47 0 MAR15 0.12968991 0 APR15 0.37 0 MAY15 0.02316756 0 JUN15 0.05 0 JUL15 0.05017248 0 AUG15 0.26142127 0 SEP15 0.01620732 2 APR15 15.044794 2 JUN15 0.0317058 83 DEC14 300001.67 262 SEP15 31.0078 310 NOV14 3257 361 JUN15 15000 382 OCT14 20.5578741 526 MAR15 0.8397922 583 DEC14 0.00669792 584 SEP15 2.96077683 603 APR15 35.11013 603 JUN15 69.369564 609 OCT14 68.6342 693 JAN15 5.609508 693 JUN15 25.26848 710 JUN15 1734.04944 718 JUL15 0.00313578 727 MAY15 0.00316756 742 JAN15 23.25582 793 MAY15 0.00633512 881 SEP15 80.5151477 921 AUG15 12.31 935 MAY15 63.7552 948 DEC14 353.77335 948 MAR15 18.398973

I added var to your code, is it right? BUt, here is the problem, the output is strange:

proc transpose data=summ out=want(drop=_: )  prefix=mto_ ;

by  id;

var id  date;

run;

The output of this code:

 id mto_1 mto_2 mto_3 mto_4 mto_5 mto_6 mto_7 mto_8 mto_9 mto_10 mto_11 mto_12 0 0 0 0 0 0 0 0 0 0 0 0 . 0 19997 20028 20058 20121 20148 20179 20209 20240 20270 20301 20332 . 2 2 2 . . . . . . . . . . 2 20179 20240 . . . . . . . . . . 83 83 . . . . . . . . . . . 83 20058 . . . . . . . . . . . 262 262 . . . . . . . . . . . 262 20332 . . . . . . . . . . . 310 310 . . . . . . . . . . . 310 20028 . . . . . . . . . . . 361 361 . . . . . . . . . . . 361 20240 . . . . . . . . . . . 382 382 . . . . . . . . . . . 382 19997 . . . . . . . . . . . 526 526 . . . . . . . . . . . 526 20148 . . . . . . . . . . . 583 583 . . . . . . . . . . . 583 20058 . . . . . . . . . . . 584 584 . . . . . . . . . . . 584 20332 . . . . . . . . . . .

The date was transformed from a ddmmyy format, could it be the problem? And, you know why there are replicated ids? Thanks!

All Replies
Frequent Contributor
Posts: 88

## Re: Summary by variable

[ Edited ]

You shouldn't need a Data step for this.

Try something like this...

proc summary data=have nway;

class id date;

var  mto;

output out=summ(drop=_: )  sum=;

run;

/* Then to have the month sums as columns...  */

proc transpose data=summ out=want(drop=_: )  prefix=mto_ ;

by  id;

id  date;

run;

Hope this helps.

Solution
‎10-22-2015 12:29 PM
Contributor
Posts: 69

## Re: Summary by variable

[ Edited ]

Hi Jerry, thanks for your answer, but there is not totally right:

proc summary output:

 id date sum_mto 0 OCT14 0.2611745 0 NOV14 0.01 0 DEC14 0.08014064 0 FEB15 0.47 0 MAR15 0.12968991 0 APR15 0.37 0 MAY15 0.02316756 0 JUN15 0.05 0 JUL15 0.05017248 0 AUG15 0.26142127 0 SEP15 0.01620732 2 APR15 15.044794 2 JUN15 0.0317058 83 DEC14 300001.67 262 SEP15 31.0078 310 NOV14 3257 361 JUN15 15000 382 OCT14 20.5578741 526 MAR15 0.8397922 583 DEC14 0.00669792 584 SEP15 2.96077683 603 APR15 35.11013 603 JUN15 69.369564 609 OCT14 68.6342 693 JAN15 5.609508 693 JUN15 25.26848 710 JUN15 1734.04944 718 JUL15 0.00313578 727 MAY15 0.00316756 742 JAN15 23.25582 793 MAY15 0.00633512 881 SEP15 80.5151477 921 AUG15 12.31 935 MAY15 63.7552 948 DEC14 353.77335 948 MAR15 18.398973

I added var to your code, is it right? BUt, here is the problem, the output is strange:

proc transpose data=summ out=want(drop=_: )  prefix=mto_ ;

by  id;

var id  date;

run;

The output of this code:

 id mto_1 mto_2 mto_3 mto_4 mto_5 mto_6 mto_7 mto_8 mto_9 mto_10 mto_11 mto_12 0 0 0 0 0 0 0 0 0 0 0 0 . 0 19997 20028 20058 20121 20148 20179 20209 20240 20270 20301 20332 . 2 2 2 . . . . . . . . . . 2 20179 20240 . . . . . . . . . . 83 83 . . . . . . . . . . . 83 20058 . . . . . . . . . . . 262 262 . . . . . . . . . . . 262 20332 . . . . . . . . . . . 310 310 . . . . . . . . . . . 310 20028 . . . . . . . . . . . 361 361 . . . . . . . . . . . 361 20240 . . . . . . . . . . . 382 382 . . . . . . . . . . . 382 19997 . . . . . . . . . . . 526 526 . . . . . . . . . . . 526 20148 . . . . . . . . . . . 583 583 . . . . . . . . . . . 583 20058 . . . . . . . . . . . 584 584 . . . . . . . . . . . 584 20332 . . . . . . . . . . .

The date was transformed from a ddmmyy format, could it be the problem? And, you know why there are replicated ids? Thanks!

Frequent Contributor
Posts: 88

## Re: Summary by variable

Not sure why you needed to add the VAR statement.   Its implied, but to be explicit, you'd write as:

``````proc summary data=have nway;
class id date;
var  mto;
output out=summ(drop=_: )  sum=;
run;

/* Then to have the month sums as columns...  */
proc transpose data=summ out=want(drop=_: )  prefix=mto_ ;
by  id;
id  date;
var mto;
run;``````

And this is the result (for me):

 ID mto_MAY15 mto_JUL15 mto_JUN15 1 290 100 . 2 . 110 200 3 230 100 .

Contributor
Posts: 69

## Re: Summary by variable

My bad, I just see my error, id is a parameter, I was confused because I have the same name for my variable...it works perfect!
Posts: 5,619

## Re: Summary by variable

One simple way, assuming your dates are SAS dates.

``````data have;
input ID	MTO	DATE :monyy.;
format date monyy.;
datalines;
1	100	jul-15
2	120	jun-15
3	130	may-15
1	90	may-15
2	80	jun-15
3	100	may-15
1	200	may-15
2	110	jul-15
3	100	jul-15
;

proc sql;
create table sumList as
select id,
put(date,monyy7.) as date,
cats("Sum(MTO)", put(date,monyy7.)) as label, sum(mto) as sumMTO
from have
group by id, calculated date, label;
quit;

proc transpose data=sumList out=sumTable(drop=_name_);
by id;
var sumMTO;
id date;
idlabel label;
run;

proc print data=sumTable noobs label; run;``````
PG
Contributor
Posts: 69

## Re: Summary by variable

Hi PG, a friend told me that proc sql is a method that take more time, is that truth? What would you choose between proc summary and proc sql?? Because I have about 500 million obs.
Posts: 5,619

## Re: Summary by variable

Yes, SQL's flexibility comes at a price. If you can get what you want with proc summary then it is probably faster. I wouldn't expect a huge difference on such a simple task however.

PG
Contributor
Posts: 69

## Re: Summary by variable

Thanks PG. I couldn't select your answer as solution. It seems you could choose only one.
Contributor
Posts: 69

## Re: Summary by variable

Pg, thanks, I tried your solution and it works perfectly!
🔒 This topic is solved and locked.