Pyrite | Level 9

## Do a calculation for each month

I have a table of the form:

 Customer ID Transaction Date for Buy 1 2016-04-15 1 2016-04-20 3 2016-04-29 3 2016-04-21 3 2016-04-25 4 2017-12-17

From this data, how would I get a summary of the type:

 Customer ID Transaction Date for Buy Number of buys in april 2016 Number of buys in december 2017 1 2016-04-15 2 0 1 2016-04-20 2 0 3 2016-04-29 3 0 3 2016-04-21 3 0 3 2016-04-25 3 0 4 2017-12-17 0 1

So I would like to get the number of buys for each month uptill to today, starting at say january 2021. So there will be very many columns in my real example.

1 ACCEPTED SOLUTION

Accepted Solutions
Super User

## Re: Do a calculation for each month

``````data have;
infile cards expandtabs;
input ID	Date :yymmdd12.;
format Date yymmdd10.;
cards;
1	2016-04-15
1	2016-04-20
3	2016-04-29
3	2016-04-21
3	2016-04-25
4	2017-12-17
;

proc freq data=have noprint;
table id*date/out=want list;
format date yymmn6.;
run;``````
3 REPLIES 3
Super User

## Re: Do a calculation for each month

``````data have;
infile cards expandtabs;
input ID	Date :yymmdd12.;
format Date yymmdd10.;
cards;
1	2016-04-15
1	2016-04-20
3	2016-04-29
3	2016-04-21
3	2016-04-25
4	2017-12-17
;

proc freq data=have noprint;
table id*date/out=want list;
format date yymmn6.;
run;``````
Meteorite | Level 14

## Re: Do a calculation for each month

Hello @SasStatistics

The following code does what you need

``````data have;
infile cards expandtabs;
input ID	Date :yymmdd12.;
format Date yymmdd10.;
cards;
1	2016-04-15
1	2016-04-20
3	2016-04-29
3	2016-04-21
3	2016-04-25
4	2017-12-17
;
proc sql;
create table have2 as
select id,date, year(date) as Year, count(id) as count
from have
group by year(date), id
order by id, date;
run;
proc transpose data=have2 out=want (drop= _NAME_) ;
by id date;
id year;
var count;
run;
data want;
set want;
array change _numeric_;
do over change;
if change=. then change=0;
end;
run ;``````

The out put will be as you wanted.

PROC Star

## Re: Do a calculation for each month

This can be done in a single data step, by reading each ID twice, once to get the counts, second time to output the original data with those counts.

``````data have;
infile cards expandtabs;
input ID	Date :yymmdd12.;
format Date yymmdd10.;
cards;
1	2016-04-15
1	2016-04-20
3	2016-04-29
3	2016-04-21
3	2016-04-25
4	2017-12-17
;

data want (drop=_: nxt_:);
do until (id^=nxt_id);
do _ntrans=1 by 1 until (intck('month',date,nxt_date)^=0 or id^=nxt_id);
merge have
have (firstobs=2 keep=id date rename=(id=nxt_id date=nxt_date));
array mnths{4:12} april_trans _unwanted5-_unwanted11 dec_trans;
end;
mnths{month(date)}=_ntrans;
end;

do _m=4,12; if  mnths{_m}=. then mnths{_m}=0; end;

do until (last.id);
set have;
by id notsorted;
output;
end;
run;``````

This program assumes that the data are grouped by id and within id, grouped by month.  The ID's (and the months within ID's) need not be in ascending order.  The technique in determining month and/or id boundaries is in the merge statement, using the "firstobs=2" parameter for one of the merged data sequences.

It uses the ARRAY statement to index the array MNTHS by month number.  Since you only want April and December, the array has lower index of 4 and upper index of 12.  And it has a bunch of unwanted variables as placeholders for May through November.   If you want other months just make changes to the variable names and/or array index range. And modify the "do _m=4,12;" loop accordingly.  It wouldn't be hard to use the array for months ranging over multiple years - just add a second dimension for year.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Discussion stats
• 3 replies
• 599 views
• 3 likes
• 4 in conversation