Hi all,
I'm a sas beginner struggling quite a bit with retain.
I have daily data where when a specific condition is met, I want to preserve that value for the subsequent two days, so in essence create the following value_lagged variable.
Date Condition Value Value_lagged
1/1/2005 1 5 0
1/2/2005 0 0 5
1/3/2005 0 0 5
1/4/2005 0 2 0
1/5/2005 0 3 0
1/6/2005 0 0 0
data have;
input Date:mmddyy10. Condition Value;
format Date mmddyy10.;
datalines;
1/1/2005 1 5
1/2/2005 0 0
1/3/2005 0 0
1/4/2005 0 2
1/5/2005 0 3
1/6/2005 0 0
;
data want;
set have;
Value_lagged=0;
lag1Condition=lag1(Condition);
lag2Condition=lag2(Condition);
lag1Value=lag1(Value);
lag2Value=lag2(Value);
if lag1Condition=1 then do;
Value_lagged=lag1Value;
end;
if lag2Condition=1 then do;
Value_lagged=lag2Value;
end;
drop lag:;
run;
data have;
input Date:mmddyy10. Condition Value;
format Date mmddyy10.;
datalines;
1/1/2005 1 5
1/2/2005 0 0
1/3/2005 0 0
1/4/2005 0 2
1/5/2005 0 3
1/6/2005 0 0
;
data want;
set have;
Value_lagged=0;
lag1Condition=lag1(Condition);
lag2Condition=lag2(Condition);
lag1Value=lag1(Value);
lag2Value=lag2(Value);
if lag1Condition=1 then do;
Value_lagged=lag1Value;
end;
if lag2Condition=1 then do;
Value_lagged=lag2Value;
end;
drop lag:;
run;
Try this:
data want;
set have;
retain count 99 value_lagged;
output; * this output is necessary because we don't want value_lagged set immediately;
if condition then do;
count = 0;
value_lagged = value;
end;
count + 1;
if count > 3 then value_lagged = 0;
drop count;
run;
Edit: added drop statement.
This is a case where LAG functions embedded in an IFN function works neatly:
data have;
input Date:mmddyy10. Condition Value;
format Date mmddyy10.;
datalines;
1/1/2005 1 5
1/2/2005 0 0
1/3/2005 0 0
1/4/2005 0 2
1/5/2005 0 3
1/6/2005 0 0
run;
data want;
set have;
value_lagged=ifn(lag(condition)=1 or lag2(condition)=1
,max(lag(value),lag2(value))
,0);
run;
This program assumes that for all records in which condition is not a 1, value is a zero. Otherwise the MAX function might choose the wrong lagged value.
Or:
data WANT ;
set HAVE;
VALUE_LAGGED=ifn(lag1(CONDITION), lag1(VALUE)
,ifn(lag2(CONDITION), lag2(VALUE)
, 0 ));
run ;
@ChrisNZ's is an improvement over my suggestion, in that it handles consecutive CONDITION=1 cases more appropriately. Namely it takes the most recent VALUE of the two qualifying records, not the maximum valuje.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.