Hi Community,
I am trying to find a way to solve an issues below mentioned.
Raw data:
ID | Month | Cost |
1 | Jan-07 | 5 |
1 | Jan-07 | 5 |
1 | Mar-07 | 5 |
1 | May-07 | 5 |
2 | Aug-08 | 6 |
2 | Aug-08 | 6 |
2 | Dec-08 | 13 |
Logic:
The requirement has 2 tasks
1) Get the Minimum and Maximum of Month variable by ID. For Ex: for ID 1 Min is Jan-07 and Max is May-07.
2) Create a sequence of month-year from Minimum to Maximum. For Ex: for ID 1 The output starts at Jan-07 and ends at May-07, Even if raw data doesn't include Feb-07,April-07 the final output needs all the missing months to be added in sequence.
3) Sum the costs by ID and month and If no costs present then assign 0. For Ex: For ID 1 the Jan-07 costs add up to 10 and the output has 10 under the cost variables. and if not costs present to add then 0 is expected under the cells.
Output:
ID | Month | Cost |
1 | Jan-07 | 10 |
1 | Feb-07 | 0 |
1 | Mar-07 | 5 |
1 | Apr-07 | 0 |
1 | May-07 | 5 |
2 | Aug-08 | 12 |
2 | Sep-08 | 0 |
2 | Oct-08 | 0 |
2 | Nov-08 | 0 |
2 | Dec-08 | 13 |
I am trying to solve this in Proc Sql but unable to get it work.
Any suggestions.
Thank you for your time.
Hi @shasank
data have;
input ID Month :monyy7. Cost;
format month monyy7.;
cards;
1 Jan-07 5
1 Jan-07 5
1 Mar-07 5
1 May-07 5
2 Aug-08 6
2 Aug-08 6
2 Dec-08 13
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("month") ;
h.definedata ("cost") ;
h.definedone () ;
end;
do until(last.id);
set have;
by id month;
if first.month then s=0;
s=sum(cost,s);
if last.month then h.add(key:month,data:s);
_min=min(_min,month);
_max=max(_max,month);
end;
month=_min;
do while(month<=_max);
if h.find() ne 0 then cost=0;
output;
month=intnx('mon',month,1);
end;
h.clear();
drop s _:;
run;
Hi Shasank,
I am not sure if I understood what you wanted exactly, but I tried something that reached the same output as you wanted. Hope this helps!
data cost;
input ID Month $ Cost;
cards;
1 Jan-07 5
1 Jan-07 5
1 Mar-07 5
1 May-07 5
2 Aug-08 6
2 Aug-08 6
2 Dec-08 13
;
run;
proc format;
value mth 1='Jan-07'
2='Feb-07'
3='Mar-07'
4='Apr-07'
5='May-07'
8='Aug-08'
9='Sep-08'
10='Oct-08'
11='Nov-08'
12='Dec-08';
run;
data cmonth (drop=i);
do i = 1 to 12;
if i le 5 then do;
month1=i;
id=1;
output;
end;
else if i ge 8 then do;
month1=i;
id=2;
output;
end;
end;
run;
proc sql;
create table mreport as
select A.ID, A.month1 format=mth. as month, sum(B.cost) as cost
from cmonth as A full join cost as B
on A.ID=B.ID and A.month1=month(input(cats('01',substr(B.month, 1,3),'1960'), date9.))
group by a.id, month1;
update mreport
set cost=case when cost=. then 0 else cost end;
quit;
Hi @shasank
data have;
input ID Month :monyy7. Cost;
format month monyy7.;
cards;
1 Jan-07 5
1 Jan-07 5
1 Mar-07 5
1 May-07 5
2 Aug-08 6
2 Aug-08 6
2 Dec-08 13
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("month") ;
h.definedata ("cost") ;
h.definedone () ;
end;
do until(last.id);
set have;
by id month;
if first.month then s=0;
s=sum(cost,s);
if last.month then h.add(key:month,data:s);
_min=min(_min,month);
_max=max(_max,month);
end;
month=_min;
do while(month<=_max);
if h.find() ne 0 then cost=0;
output;
month=intnx('mon',month,1);
end;
h.clear();
drop s _:;
run;
You could however by defining an array and sum all the measures, albeit that's gonna complicate things.
Alternatively, Just get the full ID Month data first and then merge back with the original. So the nonmatches can have value of 0.
data have;
input ID Month :monyy7. Cost;
format month monyy7.;
cards;
1 Jan-07 5
1 Jan-07 5
1 Mar-07 5
1 May-07 5
2 Aug-08 6
2 Aug-08 6
2 Dec-08 13
;
run;
data want;
do until(last.id);
set have;
by id;
_min=min(_min,month);
_max=max(_max,month);
end;
month=_min;
do while(month<=_max);
output;
month=intnx('mon',month,1);
end;
keep id month;
run;
Finally, once you have the merged complete data in place, then you are all set for a simple proc summary. So, use the above result to merge back with the original and then summarize using proc summary or SQL whichever is convenient for you.
data have;
input ID Month :monyy7. Cost;
format month monyy7.;
cards;
1 Jan-07 5
1 Jan-07 5
1 Mar-07 5
1 May-07 5
2 Aug-08 6
2 Aug-08 6
2 Dec-08 13
;
run;
proc summary data=have ;
by id month;
var cost;
output out=temp sum=;
run;
data want;
merge temp temp(keep=id month rename=(id=_id month=_month) firstobs=2);
output;
if id=_id then do;
do i=1 to intck('month',month,_month)-1;
month=intnx('month',month,1);cost=0;output;
end;
end;
drop i _:;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.