data lin;
input x y;
datalines;
1 3
4 5
2 .
5 .
6 .
7 .
;
run;
dear all,I want to update the missing variable y . The rules are below:
(1*3+4*5)/(1+4)=4.6 as the 3rd obs
(4*5+2*4.6)/(4+2)=4.87 as the 4rth obs
........
the expected result is
I do the job like the follows ,but It looks a little silly. is there a simple way or other method ?
%macro macro_data3();
data _null_;
set lin nobs=nobs;
call symput("nobs",nobs);
run;
/*%put &nobs.;*/
data lin3;
set lin;
/*if _n_ in (1,2) then z=y;*/
%do i=1 %to &nobs.;
z=(lag(y)*lag(x)+lag2(y)*lag2(x))/(lag(x)+lag2(x));
if y eq . then y=z;
%end;
drop z;
run;
%mend;
%macro_data3();
ANY suggestion is welcome! Thanks in advance!
Here is one way
data lin;
input x y;
datalines;
1 3
4 5
2 .
5 .
6 .
7 .
;
data want(drop=obs);
array _x[0:1] _temporary_;
array _y[0:1] _temporary_;
do obs=1 by 1 until (lr);
set lin end=lr;
if y=. then y=sum(_x[0]*_y[0], _x[1]*_y[1])/sum(of _x[*]);
output;
_x[mod(obs, 2)]=x;
_y[mod(obs, 2)]=y;
end;
run;
Here is one way
data lin;
input x y;
datalines;
1 3
4 5
2 .
5 .
6 .
7 .
;
data want(drop=obs);
array _x[0:1] _temporary_;
array _y[0:1] _temporary_;
do obs=1 by 1 until (lr);
set lin end=lr;
if y=. then y=sum(_x[0]*_y[0], _x[1]*_y[1])/sum(of _x[*]);
output;
_x[mod(obs, 2)]=x;
_y[mod(obs, 2)]=y;
end;
run;
Another variation in using Array as given by @PeterClemmensen .
data want;
set lin;
array kx[7] _temporary_;
array ky[7] _temporary_;
kx[_n_] = x;
ky[_n_] = y;
yy = y;
if missing(ky[_n_]) then do;
ky[_n_] = (kx[_n_ - 1] * ky[_n_ - 1] + kx[_n_ - 2] * ky[_n_ - 2]) /
(kx[_n_ - 1] + kx[_n_ - 2]) ;
if missing(y) then yy = ky[_n_];
end;
run;
Edited: The following statement can be rewritten
if missing(y) then yy = ky[_n_];
as
yy = ky[_n_];
The array-size can be made dynamic by getting the number of observations as a macro variable from outside this data step.
thank you @KachiM ! I thought about it before, but you did it. the way is just like transposing the data.
Thank YOU! You inspired me。
AS a trainee i could do the job like this.:
Hi @duanzongran Nice fun puzzle, sets of 2 (window length 2) is pretty straight forward as we can afford to type and assign lag1 and lag2. Sorry missed the party earlier.
data lin;
input x y;
datalines;
1 3
4 5
2 .
5 .
6 .
7 .
;
run;
data want;
do _n_=1,2 until(z);
set lin end=z;
array t(2) _temporary_ ;
_l1=lag(x);
_l2=lag2(x);
if y=. then y=sum(of t(*))/sum(of _l:);
t(_n_)=y*x;
output;
end;
drop _:;
run;
Thank you @novinosrin ! As a trainee I have leart a lot from your code!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.