I would like to calculate weighted averages across 37 weeks, where each average includes all previous weeks, inversely weighted by their distance from the current week.
My data is in wide format where each observation is one person, with variables wk1 - wk37 as weekly values.
This is what I would like to accomplish:
data test;
avg1 = .;
avg2 = wk1;
avg3 = (( wk2*(2/3) + wk1*(1/3)) / 2);
avg4 = (( wk3*(3/4) + wk2*(2/4) + wk1*(1/4)) / 3) ;
avg5 = (( wk4*(4/5) + wk3*(3/5) + wk2*(2/5) + wk1*(1/5)) / 4);
avg6 = (( wk5*(5/6) + wk4*(4/6) + wk3*(3/6) + wk2*(2/6) + wk1*(1/6)) / 5);
avg7 = (( wk6*(6/7) + wk5*(5/7) + wk4*(4/7) + wk3*(3/7) + wk2*(2/7)) + wk1(1/7)) / 6);
avg8 = (( wk7*(7/8) + wk6*(6/8) + wk5*(5/8) + wk4*(4/8) + wk3*(3/8) + wk2*(2/8) + wk1*(1/8) / 7;
etc. ....
Thank you!
If you do..
I think the easiest way to solve this problem is to think of it in terms of vectors and matrices and use the IML language.
A way to go about is is like this, which is easily extendable to 37 weeks..
Feel free to ask again if you have questions 🙂
data have;
input personID wk1-wk5;
datalines;
1 23 61 31 86 32
2 24 69 34 83 31
3 21 65 33 85 35
4 28 63 33 81 34
;
proc iml;
use have;
read all var ("wk1":"wk5") into W;
read all var {personID} into personID;
close have;
Avg=j(nrow(W),ncol(W),.);
Avg[ ,2] = W[ ,1];
do i=1 to nrow(W);
do j=3 to 5;
v=do(1/j, (j-1)/j, 1/j);
Avg[i,j]= sum(v#W[i,1:j-1])/(j-1);
end;
end;
Avg=personID||Avg;
varNames= "personID"||("avg1":"avg5");
create want from Avg[c=varNames];
append from Avg;
close want;
quit;
proc print data=want;
run;
Do you have a SAS/ETS license?
Yes
Sorry. I meant SAS/IML. Do you have a SAS/IML license?
If you do..
I think the easiest way to solve this problem is to think of it in terms of vectors and matrices and use the IML language.
A way to go about is is like this, which is easily extendable to 37 weeks..
Feel free to ask again if you have questions 🙂
data have;
input personID wk1-wk5;
datalines;
1 23 61 31 86 32
2 24 69 34 83 31
3 21 65 33 85 35
4 28 63 33 81 34
;
proc iml;
use have;
read all var ("wk1":"wk5") into W;
read all var {personID} into personID;
close have;
Avg=j(nrow(W),ncol(W),.);
Avg[ ,2] = W[ ,1];
do i=1 to nrow(W);
do j=3 to 5;
v=do(1/j, (j-1)/j, 1/j);
Avg[i,j]= sum(v#W[i,1:j-1])/(j-1);
end;
end;
Avg=personID||Avg;
varNames= "personID"||("avg1":"avg5");
create want from Avg[c=varNames];
append from Avg;
close want;
quit;
proc print data=want;
run;
This worked, thank you!
@Reeza, @PGStats, @Astounding can I ask for a minute of your time?
Am I overthinking this problem? An IML approach is the only solution I could think of and it worked, but I feel like there is a simpler way?
Thank you in advance.
I would expect you can use arrays to do this. I'm just being lazy on a Sunday ... here's the structure but without a key formula that needs to be filled in.
data want;
set have;
array wk {37};
array avg {37};
do j=2 to 37;
avg_numer = 0;
do k=1 to j-1;
* increment avg_numer;
end;
avg{j} = avg_numer / (j-1);
end;
Arrays?
data want;
array wk{37};
array avg{37};
set have;
do i = 1 to dim(wk);
do j = 1 to i-1;
avg{i} = sum(avg{i}, wk{j} * j / i);
end;
if i > 1 then avg{i} = avg{i} / (i-1);
end;
drop i j;
run;
This works for my previous method - however now I want to modify my formula so that:
avg(wk4) = wk3*3 + wk2*2 + wk1*1 / 3!
I attempted to modify the code but it is giving me this:
wk1 | 15.3763 | avg1 | . |
wk2 | 12.1931 | avg2 | 15.3763 |
wk3 | 10.2969 | avg3 | 19.8813 |
wk4 | 9.77625 | avg4 | 11.7755 |
wk5 | 8.708 | avg5 | 4.57326 |
wk6 | 15.4319 | avg6 | 1.27748 |
wk7 | 38.5981 | avg7 | 0.34151 |
wk8 | 10.0525 | avg8 | 0.1024 |
wk9 | 8.78688 | avg9 | 0.01479 |
wk10 | 10.6669 | avg10 | 0.00186 |
wk11 | 10.2844 | avg11 | 0.00022 |
wk12 | 15.4988 | avg12 | 2.2E-05 |
wk13 | 10.8888 | avg13 | 2.3E-06 |
wk14 | 7.9475 | avg14 | 2E-07 |
wk15 | 9.81188 | avg15 | 1.53E-08 |
wk16 | 11.3281 | avg16 | 1.13E-09 |
wk17 | 16.7944 | avg17 | 7.95E-11 |
wk18 | 21.3356 | avg18 | 5.48E-12 |
wk19 | 11.0613 | avg19 | 3.64E-13 |
wk20 | 13.3138 | avg20 | 2.09E-14 |
wk21 | 12.1488 | avg21 | 1.15E-15 |
wk22 | 11.0613 | avg22 | 6.00E-17 |
wk23 | 24.0344 | avg23 | 2.94E-18 |
wk24 | 29.5988 | avg24 | 1.49E-19 |
wk25 | 21.4113 | avg25 | 7.37E-21 |
wk26 | 14.6531 | avg26 | 3.29E-22 |
wk27 | 14.7 | avg27 | 1.36E-23 |
wk28 | 18.7425 | avg28 | 5.40E-25 |
wk29 | 15.8988 | avg29 | 2.10E-26 |
wk30 | 13.8875 | avg30 | 7.77E-28 |
wk31 | 19.5225 | avg31 | 2.75E-29 |
wk32 | 17.2169 | avg32 | 9.60E-31 |
wk33 | 12.4269 | avg33 | 3.21E-32 |
wk34 | 16.275 | avg34 | 1.02E-33 |
wk35 | 22.9125 | avg35 | 3.19E-35 |
wk36 | 20.1725 | avg36 | 9.88E-37 |
wk37 | 12.145 | avg37 | 2.94E-38 |
data want;
array wk{37};
array avg{37};
set wide;
do i = 1 to dim(wk);
do j = 1 to i-1;
avg{i} = sum(avg{i}, wk{j} * j);
end;
if i > 1 then avg{i} = avg{i} / fact(i-1);
end;
drop i j;
run;
Looks about right to me. Fact(n) increases very quickly with n. I would guess that what you really want is to divise the weighted sum by the sum of the weights. Something like :
data want;
array wk{37};
array avg{37};
set have;
do i = 1 to dim(wk);
w = 0;
do j = 1 to i-1;
avg{i} = sum(avg{i}, wk{j} * j);
w = w + j;
end;
if i > 1 then avg{i} = avg{i} / w;
end;
drop i j w;
run;
Yes thank you!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.