Hi Everyone,
I want to calculate a time-weighted average for non-zero series and the procedure is as below:
For the data below,
For day 4, ID=1: I have 3 non-zero value:
1 1 6
2 1 43 1 0
4 1 -7
the average will be: (sumproduct of value and day)/(sum of day) = (1*6 + 2*4 + 4*-7)/(1+2+4)
My code below works fine but I know you always have a faster code and I would like to learn about your method.
Thank you,
HHC
data have;
input date id v ;
datalines;
1 1 6
2 1 4
3 1 0
4 1 -7
5 1 5
1 2 1
2 2 0
3 2 -2
4 2 2
5 2 5
6 2 1
;run;
proc sort data=have; by id descending date;run;
data want;
set have;
drop d1 id1 v1;
sum_value=0;
sum_date=0;
do n=_N_ to _N_+3;
set have (rename = (date=d1 id=id1 v=v1)) point = n;
if id=id1 and v1^=0 then do;
sum_value=sum_value+v1*d1;
sum_date=sum_date+d1;
end;
end;
time_weighted_average = sum_value/sum_date;
run;
What you describe is the weighted sum of the observations for which v^=0, assuming that the Date variable is positive. If you don't have to use the DATA step, I would write this as
proc means data=Have mean;
where v ^= 0;
class id;
var v;
weight Date;
run;
If you do have to use the DATA step, then this is an ideal situation to use a BY-group analysis and the FIRST.ID and LAST.ID variables.
data Want;
set Have(where=(v^=0));
by id;
if first.id then do;
sum = 0; sumw = 0;
end;
sum + date*v;
sumw + date;
if last.id then do;
time_weighted_average = sum / sumw;
output;
end;
keep time_weighted_average;
run;
What you describe is the weighted sum of the observations for which v^=0, assuming that the Date variable is positive. If you don't have to use the DATA step, I would write this as
proc means data=Have mean;
where v ^= 0;
class id;
var v;
weight Date;
run;
If you do have to use the DATA step, then this is an ideal situation to use a BY-group analysis and the FIRST.ID and LAST.ID variables.
data Want;
set Have(where=(v^=0));
by id;
if first.id then do;
sum = 0; sumw = 0;
end;
sum + date*v;
sumw + date;
if last.id then do;
time_weighted_average = sum / sumw;
output;
end;
keep time_weighted_average;
run;
Why do you only want to read 4 of the 5 observations for ID = 1?
Check out the Weight Statement of the Proc Summary.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.