Hi Everyone,
I want to calculate a time-weighted average for non-zero series and the procedure is as below:
For the data below,
For day 4, ID=1: I have 3 non-zero value:
1 1 6
2 1 43 1 0
4 1 -7
the average will be: (sumproduct of value and day)/(sum of day) = (1*6 + 2*4 + 4*-7)/(1+2+4)
My code below works fine but I know you always have a faster code and I would like to learn about your method.
Thank you,
HHC
data have;
input date id v ;
datalines;
1 1 6
2 1 4
3 1 0
4 1 -7
5 1 5
1 2 1
2 2 0
3 2 -2
4 2 2
5 2 5
6 2 1
;run;
proc sort data=have; by id descending date;run;
data want;
set have;
drop d1 id1 v1;
sum_value=0;
sum_date=0;
do n=_N_ to _N_+3;
set have (rename = (date=d1 id=id1 v=v1)) point = n;
if id=id1 and v1^=0 then do;
sum_value=sum_value+v1*d1;
sum_date=sum_date+d1;
end;
end;
time_weighted_average = sum_value/sum_date;
run;
What you describe is the weighted sum of the observations for which v^=0, assuming that the Date variable is positive. If you don't have to use the DATA step, I would write this as
proc means data=Have mean;
where v ^= 0;
class id;
var v;
weight Date;
run;
If you do have to use the DATA step, then this is an ideal situation to use a BY-group analysis and the FIRST.ID and LAST.ID variables.
data Want;
set Have(where=(v^=0));
by id;
if first.id then do;
sum = 0; sumw = 0;
end;
sum + date*v;
sumw + date;
if last.id then do;
time_weighted_average = sum / sumw;
output;
end;
keep time_weighted_average;
run;
What you describe is the weighted sum of the observations for which v^=0, assuming that the Date variable is positive. If you don't have to use the DATA step, I would write this as
proc means data=Have mean;
where v ^= 0;
class id;
var v;
weight Date;
run;
If you do have to use the DATA step, then this is an ideal situation to use a BY-group analysis and the FIRST.ID and LAST.ID variables.
data Want;
set Have(where=(v^=0));
by id;
if first.id then do;
sum = 0; sumw = 0;
end;
sum + date*v;
sumw + date;
if last.id then do;
time_weighted_average = sum / sumw;
output;
end;
keep time_weighted_average;
run;
Why do you only want to read 4 of the 5 observations for ID = 1?
Check out the Weight Statement of the Proc Summary.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.