Hi everybody,
I have data on dates and costs of visits, and I need to add them up over individuals seperated into months.
Dummy data is something like this:
Have:
Id Date Cost Visit_id
100 10sep2005 1000 98
105 20sep2005 3000 103
105 22sep2005 1000 105
200 01oct2005 2000 123
200 03nov2005 1500 201
200 03nov2005 500 202
Want:
Id Month Visits Cost
100 sep2005 1 1000
105 sep2005 2 4000
200 oct2005 1 2000
200 nov2005 2 2000
I've tried using the ".first" and ".last" properties when creating a loop, but I can't figure out exactly how to do it.
Any help would be greatly appreciated.
While the sql solutions will give you what you want, you did ask how you might be able to do it in a datastep. Here is one way:
data want;
set have;
month=put(date,monyy7.);
run;
data want (drop=date visit_id in_cost);
do until (lastone);
set want (rename=(cost=in_cost)) end=lastone;
by id month notsorted;
if first.month then do;
cost=in_cost;
visits=1;
end;
else do;
cost+in_cost;
visits+1;
end;
if last.month then output;
end;
run;
First convert your data to a month-year column, perhaps using Put(date,monyy.).
Then you use SQL with group by id and month, and using count(*) for visits.
Or proc summary, where the _FREQ_ output column will be your visits (you can rename in the output statement).
/Linus
data have;
informat date date9. ;
format date date9. month monyy7.;
input id date cost visit_id;
month=input(put(date,monyy7.),monyy7.);
cards;
100 10sep2005 1000 98
105 20sep2005 3000 103
105 22sep2005 1000 105
200 01oct2005 2000 123
200 03nov2005 1500 201
200 03nov2005 500 202
;
run;
proc sql;
create table want as select id, month, count(*) as visits,sum(cost) as cost
from have
group by id,month
order by id,month;
quit;
While the sql solutions will give you what you want, you did ask how you might be able to do it in a datastep. Here is one way:
data want;
set have;
month=put(date,monyy7.);
run;
data want (drop=date visit_id in_cost);
do until (lastone);
set want (rename=(cost=in_cost)) end=lastone;
by id month notsorted;
if first.month then do;
cost=in_cost;
visits=1;
end;
else do;
cost+in_cost;
visits+1;
end;
if last.month then output;
end;
run;
data have; input id date : date9. cost visit_id; cards; 100 10sep2005 1000 98 105 20sep2005 3000 103 105 22sep2005 1000 105 200 01oct2005 2000 123 200 03nov2005 1500 201 200 03nov2005 500 202 ; run; proc summary data=have nway; class id date; format date monyy7.; var cost ; output out=want(drop=_:) n=visit sum=; run;
Ksharp
Thanks for all the help, guys. Highly appreciated, I see there are many ways to do this. I have a lot to learn!
like the art297 way ....
... to "gild that lilly"
data want (keep= id month visits cost );
call missing( cost ) ;
do visits=1 by 1 until (last.month);
set want (rename=(cost=in_cost)) ;
by id month notsorted;
cost + in_cost ;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.