I'm trying to keep measurement dates as a new index date if the measurement date is more than 180 days after the last saved index date (i.e., the last measurement date that met the >180 days condition). I'm having trouble iterating this condition every row by the ID group.
I have this data:
data have;
input id measure_dt :mmddyy. measure_cnt;
format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
;
run;
I want this:
id | measure_dt | measure_cnt | days since last index date | new index date |
1 | 1/22/2008 | 1 | . | 1/22/2008 |
1 | 4/21/2008 | 2 | 90 | . |
1 | 8/4/2008 | 3 | 195 | 8/4/2008 |
1 | 9/17/2008 | 4 | 44 | . |
2 | 5/2/2009 | 1 | . | 5/2/2009 |
2 | 11/22/2009 | 2 | 204 | 11/22/2009 |
2 | 1/9/2010 | 3 | 48 | . |
I've tried the below code which is getting closer to what I want, but I think there are issues with my do loop as it's outputting multiple rows per each observation that I have.
data want;
set have;
by id;
if first.id then do;
new_index_dt=measure_dt;
lag_index=measure_dt;
output;
end;
lag_index=lag(new_index_dt);
if first.id then lag_index=measure_dt;
do measure_cnt=2 to max(measure_cnt);
days_since_last=measure_dt-lag_index;
if days_since_last>=180 then new_index_dt=measure_dt;
else if days_since_last<180 then new_index_dt=lag_index;
lag_index=lag(new_index_dt);
output;
end;
format new_index_dt lag_index mmddyy10.;
run;
I'm using SAS EG 8.3
A simple RETAIN should do the trick.
data have;
input id measure_dt :yymmdd. measure_cnt;
format measure_dt yymmdd10.;
datalines;
1 2008-01-22 1
1 2008-04-21 2
1 2008-08-04 3
1 2008-09-17 4
2 2009-05-02 1
2 2009-11-22 2
2 2010-01-09 3
;
data want;
set have;
by id;
if first.id then index_dt = measure_dt;
retain index_dt;
format index_dt yymmdd10.;
days=measure_dt - index_dt;
if days > 180 then do;
index_dt=measure_dt;
end;
run;
Result
measure_ Obs id measure_dt cnt index_dt days 1 1 2008-01-22 1 2008-01-22 0 2 1 2008-04-21 2 2008-01-22 90 3 1 2008-08-04 3 2008-08-04 195 4 1 2008-09-17 4 2008-08-04 44 5 2 2009-05-02 1 2009-05-02 0 6 2 2009-11-22 2 2009-11-22 204 7 2 2010-01-09 3 2009-11-22 48
I don't understand the missing values in your expected output. Why would you want that? If you want that you will have to make another variable in addition to the one you use to remember (retain) the index date.
A simple RETAIN should do the trick.
data have;
input id measure_dt :yymmdd. measure_cnt;
format measure_dt yymmdd10.;
datalines;
1 2008-01-22 1
1 2008-04-21 2
1 2008-08-04 3
1 2008-09-17 4
2 2009-05-02 1
2 2009-11-22 2
2 2010-01-09 3
;
data want;
set have;
by id;
if first.id then index_dt = measure_dt;
retain index_dt;
format index_dt yymmdd10.;
days=measure_dt - index_dt;
if days > 180 then do;
index_dt=measure_dt;
end;
run;
Result
measure_ Obs id measure_dt cnt index_dt days 1 1 2008-01-22 1 2008-01-22 0 2 1 2008-04-21 2 2008-01-22 90 3 1 2008-08-04 3 2008-08-04 195 4 1 2008-09-17 4 2008-08-04 44 5 2 2009-05-02 1 2009-05-02 0 6 2 2009-11-22 2 2009-11-22 204 7 2 2010-01-09 3 2009-11-22 48
I don't understand the missing values in your expected output. Why would you want that? If you want that you will have to make another variable in addition to the one you use to remember (retain) the index date.
Below code returning zero instead of missings and repeating the values for the new index date - but it wouldn't be hard to add a bit of if/then logic to create the missings instead.
data have;
input id measure_dt :mmddyy. measure_cnt;
format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
;
run;
data want;
set have;
by id measure_dt;
attrib days_last_ind label='days since last index date'
new_ind_date lable='new index date' format=mmddyy10.
;
retain new_ind_date;
if first.id then new_ind_date=measure_dt;
days_last_ind=measure_dt-new_ind_date;
if days_last_ind>180 then new_ind_date=measure_dt;
run;
proc print data=want;
run;
data have;
input id measure_dt :mmddyy. measure_cnt;
format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
;
run;
data want;
set have;
by id;
retain new_index_date ;
if first.id then new_index_date=measure_dt ;
else days_since_last_index_date=measure_dt-new_index_date;
if days_since_last_index_date>180 then new_index_date=measure_dt ;
format new_index_date mmddyy10.;
run;
For each qualifying observation, generate a _cutoff_date (retained), against which subsequent measure_dt values will be tested:
data have;
input id measure_dt :mmddyy. measure_cnt;
format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
run;
data want (drop=_:);
set have;
by id;
retain _cutoff_date;
if first.id=1 or measure_dt>_cutoff_date then do;
new_index_date=measure_dt;
_cutoff_date=measure_dt+180;
end;
format new_index_date mmddyy10.;
run;
This replicates the results you showed. But if you want to keep the established new_index_date for records within 180 dates, just add new_index_date to the RETAIN statement.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.