Consider the following code:
data test;
do i = 1 to 100;
if i=1 then do;
x=0.5;
end;
else do;
x=lag(x)+0.1;
end;
output;
end;
run;
I expect the result to be 0.5 0.6 0.7 0.8
But the result is: 0.5 . 0.6 . 0.7 . 0.8
There are missing values. What happened here?
----------------------------------------------------------------------------------------------
Finally I thought maybe I have to use IML with some more lines of code:
proc iml;
x=J(100,1,0.5);
e=J(100,1);
call randseed(1);
call randgen(e, "normal");
do i=3 to 100;
x[i]=0.3*x[i-1]+0.4*x[i-2]+e[i];
end;
create test from x [colnames=("x")];
append from x;
close test;
quit;
The LAG() function cannot return values you never passed into it. Unless you want to do something very "creative" you should never execute LAG() or DIF() conditionally.
But in this case there is no need to "lag" the value of X. Since you are running the loop within a single iteration of the data step, the "old" value of X is still there.
data test;
do i = 1 to 100;
if i=1 then do;
x=0.5;
end;
else do;
x=x+0.1;
end;
output;
end;
run;
Or more simply.
data test;
x=0.5;
do i = 1 to 100;
output;
x + 0.1;
end;
run;
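To see why a conditional LAG() produces the alternating missing values in the question, here is a minimal sketch. The data set and variable names (lag_demo, lag_always, lag_even) are made up for illustration. Each LAG() call keeps its own queue, and that queue only advances when the call actually executes.

```sas
data lag_demo;
  do i = 1 to 6;
    x = i;
    /* Executed on every pass: returns the x of the previous pass. */
    lag_always = lag(x);
    /* Executed only when i is even: returns the x from the last time */
    /* THIS call ran, which is two passes back, not one.              */
    if mod(i, 2) = 0 then lag_even = lag(x);
    else lag_even = .;
    output;
  end;
run;
```

Here lag_always is ., 1, 2, 3, 4, 5, while lag_even on the even passes is ., 2, 4. The conditional call never saw the odd values of x, so it cannot return them.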
What if I need some earlier rows, e.g. x(i-2), x(i-3)?
@Kilasuelika wrote:
What if I need some earlier rows, e.g. x(i-2), x(i-3)?
Stick the value into an array and use it as needed.
data test;
x=0.5;
array y(100);
do i = 1 to 100;
y[i]=x;
output;
x + 0.1;
end;
run;
If you need to reference the value of x from 2 iterations previous, use y[i-2]. (Caution: this works only when i is 3 or greater; the array does not define y[0] or y[-2] as valid indexes.)
I use Y because you cannot have an array name the same as an existing variable.
array y[100];
will generate 100 variables y1, y2, y3, ..., which is not what I want. I only need a single variable, x.
Do I have to use IML to achieve my goal?
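One possible way to stay in the DATA step without creating y1-y100 is a temporary array, which holds values in memory but is not written to the output data set. The sketch below is an assumption about what is wanted: it borrows the coefficients 0.3 and 0.4 and the starting value 0.5 from the IML code, leaves out the random term, and uses the made-up array name hist.

```sas
data test;
  /* _TEMPORARY_ elements are scratch storage: no variables are created. */
  array hist[100] _temporary_;
  do i = 1 to 100;
    /* The first two values are fixed, so hist[i-2] is never out of range. */
    if i <= 2 then x = 0.5;
    else x = 0.3*hist[i-1] + 0.4*hist[i-2];
    hist[i] = x;   /* remember x for later iterations */
    output;
  end;
  keep x;          /* the output data set holds the single variable x */
run;
```

Any earlier value, such as hist[i-3], is available the same way as long as the index stays at 1 or above.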
@Kilasuelika wrote:
What if I need some earlier rows, e.g. x(i-2), x(i-3)?
Then show an example that actually has an input dataset that has multiple observations so that there will actually be multiple "rows".
Here is a method to create three lagged copies of X while making sure that values from a previous group do not bleed into the current group.
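data want;
set have;
by id;
/* Call LAG unconditionally so every observation enters each queue. */
lagx1 = lag1(x);
lagx2 = lag2(x);
lagx3 = lag3(x);
array lagx [3];
/* ROW counts the observations within the current ID group. */
if first.id then row=1;
else row+1;
/* On row k only lags 1 to k-1 come from this group; clear the rest. */
do index= row to dim(lagx);
lagx[index]=.;
end;
run;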
I didn't mean creating lagged variables from existing data. Consider an AR(2) model:
x[1]=0.5, x[i]=x[i-1]+x[i-2]+rand()
At each loop, generate a random value and add it to the previous values. There is only a single variable.
@Kilasuelika wrote:
I didn't mean creating lagged variables from existing data. Consider an AR(2) model:
x[1]=0.5, x[i]=x[i-1]+x[i-2]+rand()
At each loop, generate a random value and add it to the previous values. There is only a single variable.
You will have to provide a more complete example to explain what it is you are trying to do.
What you are showing sounds like:
data want;
x=0.5 ;
do i=1 to 10;
output;
lag1=x;
lag2=lag(x);
rand=rand('uniform');
x=sum(of lag1 lag2 rand);
end;
run;
Obs        x     i      lag1      lag2      rand
  1   0.5000     1     .         .         .
  2   0.5964     2    0.5000     .        0.09637
  3   2.0343     3    0.5964    0.5000    0.93795
  4   3.2650     4    2.0343    0.5964    0.63433
  5   5.8931     5    3.2650    2.0343    0.59379
  6   9.8466     6    5.8931    3.2650    0.68843
  7  16.4166     7    9.8466    5.8931    0.67684
  8  27.1344     8   16.4166    9.8466    0.87119
  9  44.5086     9   27.1344   16.4166    0.95770
 10  72.3635    10   44.5086   27.1344    0.72051
Which you can do without the LAG() function by just reversing the order of the assignment statements.
lag2=lag1;
lag1=x;
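Putting those two assignments into a full step, a DATA step version of the IML recursion might look like the sketch below. It assumes the coefficients 0.3 and 0.4 and the starting values 0.5 from the IML code, and the seed passed to CALL STREAMINIT is arbitrary. The random numbers will not match the ones IML's RANDGEN produces, so the series will differ in detail.

```sas
data want;
  call streaminit(1);        /* arbitrary seed for RAND */
  do i = 1 to 100;
    /* x[1] and x[2] are fixed starting values, as in the IML code. */
    if i <= 2 then x = 0.5;
    else x = 0.3*lag1 + 0.4*lag2 + rand('normal');
    output;
    /* Shift the history: the oldest value first, then the newest. */
    lag2 = lag1;
    lag1 = x;
  end;
  keep i x;                  /* drop the helper variables */
run;
```

Because the whole loop runs inside one iteration of the data step, lag1 and lag2 simply keep their values from one pass to the next; no LAG() call or RETAIN statement is needed.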