Hi Everyone,
For each id, I want to calculate a moving average using only the values not equal to 0.
For the data below, with a moving average of 3 and lag 1, the average for record 4 should be (3+9)/2 and the average for record 5 should be 9/1.
PROC EXPAND with WHERE v2>0 does not work that way.
Can you please help?
Thanks,
HHC
data have;
input date id v2;
datalines;
1 1 3
2 1 0
3 1 9
4 1 0
5 1 5
1 2 0
2 2 0
3 2 9
4 2 0
5 2 5
;run;
*NOT right;
proc expand data=have out=want;
where v2>0;
convert v2 = avg/transformout=(MOVave 3 lag 1);
run;
You want the average of the previous three observations, excluding those with V2=0. How about a DATA step:
data have;
input date id v2;
datalines;
1 1 3
2 1 0
3 1 9
4 1 0
5 1 5
1 2 0
2 2 0
3 2 9
4 2 0
5 2 5
;run;
data want;
array prev_three {3} _temporary_;
set have;
by id;
if first.id then call missing(of prev_three{*});
avg=mean(of prev_three{*});
prev_three{mod(_n_,3)+1}=ifn(v2>0,v2,.);
run;
The important thing here is that AVG is calculated prior to replacing the 3rd previous value of V2 in the PREV_THREE array with the current value.
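To check the result, here is a small sketch that compares WANT against values worked out by hand from the sample data (for id 1: missing, 3, 3, (3+9)/2=6, 9; for id 2: missing, missing, missing, 9, 9). The EXPECTED data set is something I made up for the check, it is not part of the original code.

```sas
/* Hypothetical check data: lagged averages computed by hand */
data expected;
    input date id avg;
    datalines;
1 1 .
2 1 3
3 1 3
4 1 6
5 1 9
1 2 .
2 2 .
3 2 .
4 2 9
5 2 9
;
run;

/* Compare AVG observation by observation with the DATA step result */
proc compare base=expected compare=want;
    var avg;
run;
```

Record 4 of id 1 comes out as 6 because the 0 in record 2 was stored as missing, and MEAN ignores missing values.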
Shouldn't it be
convert v2 = avg/transformout=(MOVave 1 lag 3);
convert v2 = avg/transformout=(MOVave 1 lag 3);
uses the prior values regardless of whether they are 0, so it will not work.
Thanks.
HHC
Hi,
I did it the vertical, Excel-like way: for each record, read the neighboring rows and accumulate the nonzero values.
It works.
If you have a shorter method, please let me know.
Thanks,
HHC
data have;
input date id value;
datalines;
1 1 3
2 1 0
3 1 9
4 1 0
5 1 5
1 2 1
2 2 0
3 2 9
4 2 0
5 2 5
;run;
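proc sort data=have; by id descending date; run;
data want;
    set have nobs=n_obs;
    drop id1 date1 v1 i;
    sum=0;
    count=0;
    /* After the descending sort, the next 3 rows hold the 3 previous dates */
    do i=_N_+1 to min(_N_+3, n_obs);   /* do not read past the last row */
        set have (rename=(date=date1 id=id1 value=v1)) point=i;
        if id1=id and v1^=0 then do;
            sum=sum+v1;
            count=count+1;
        end;
    end;
    /* Leave AVERAGE missing when the window has no nonzero value */
    if count>0 then average=sum/count;
    output;
run;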
Easiest is very likely:
1) add a new variable that is set to missing when the variable you want to average is 0
2) run PROC EXPAND on that new variable
3) (you may want to drop the added variable afterwards)
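A rough sketch of those steps, untested on my side. The names HAVE2 and V2_NZ are made up for illustration. It assumes that METHOD=NONE keeps PROC EXPAND from interpolating the missing values and that MOVAVE uses only the nonmissing values in its window (no NOMISS option), so please check the result against the values you expect:

```sas
/* Step 1: copy V2, turning zeros into missing values (V2_NZ is a made-up name) */
data have2;
    set have;
    v2_nz = ifn(v2 > 0, v2, .);
run;

/* Step 2: 3-period moving average, lagged by 1, within each ID.
   METHOD=NONE is assumed to leave the missing values un-interpolated. */
proc expand data=have2 out=want method=none;
    by id;
    id date;
    convert v2_nz = avg / transformout=(movave 3 lag 1);
run;
```

The BY statement matters here: without it the window would run across the boundary between id 1 and id 2.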
Your code is amazing.
It works much, much faster than my code!
Thank you,
HHC