BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
rachellej
Calcite | Level 5

I'm trying to keep measurement dates as a new index date if the measurement date is more than 180 days after the last saved index date (i.e., the last measurement date that met the >180 days condition).  I'm having trouble iterating this condition every row by the ID group. 

I have this data:

 

data have;
input id measure_dt :mmddyy. measure_cnt;
format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
;
run;

 

I want this:

idmeasure_dtmeasure_cntdays since last index datenew index date
11/22/20081.1/22/2008
14/21/2008290.
18/4/200831958/4/2008
19/17/2008444.
25/2/20091.5/2/2009
211/22/2009220411/22/2009
21/9/2010348.

 

I've tried the below code which is getting closer to what I want, but I think there are issues with my do loop as it's outputting multiple rows per each observation that I have. 

data want;			
	set have;		
	by id;			
	if first.id then do; 		
		new_index_dt=measure_dt;	
		lag_index=measure_dt;	
		output;	
	end;		
	
	lag_index=lag(new_index_dt);		
	if first.id then lag_index=measure_dt;		
			
	do measure_cnt=2 to max(measure_cnt);		
		days_since_last=measure_dt-lag_index;	
		if days_since_last>=180 then new_index_dt=measure_dt;	
			else if days_since_last<180 then new_index_dt=lag_index;
		lag_index=lag(new_index_dt);	
		output;	
	end;		
	format new_index_dt lag_index mmddyy10.;		
run;

I'm using SAS EG 8.3

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

A simple RETAIN should do the trick.

data have;
  input id measure_dt :yymmdd. measure_cnt;
  format measure_dt yymmdd10.;
datalines;
1 2008-01-22 1
1 2008-04-21 2
1 2008-08-04 3
1 2008-09-17 4
2 2009-05-02 1
2 2009-11-22 2
2 2010-01-09 3
;

data want;
  set have;
  by id;
  if first.id then index_dt = measure_dt;
  retain index_dt;
  format index_dt yymmdd10.;
  days=measure_dt - index_dt;
  if days > 180 then do;
     index_dt=measure_dt;
  end;
run;

Result

                           measure_
Obs    id    measure_dt       cnt        index_dt    days

 1      1    2008-01-22        1       2008-01-22       0
 2      1    2008-04-21        2       2008-01-22      90
 3      1    2008-08-04        3       2008-08-04     195
 4      1    2008-09-17        4       2008-08-04      44
 5      2    2009-05-02        1       2009-05-02       0
 6      2    2009-11-22        2       2009-11-22     204
 7      2    2010-01-09        3       2009-11-22      48

I don't understand the missing values in your expected output.  Why would you want that?  If you want that you will have to make another variable in addition to the one you use to remember (retain) the index date.

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

A simple RETAIN should do the trick.

data have;
  input id measure_dt :yymmdd. measure_cnt;
  format measure_dt yymmdd10.;
datalines;
1 2008-01-22 1
1 2008-04-21 2
1 2008-08-04 3
1 2008-09-17 4
2 2009-05-02 1
2 2009-11-22 2
2 2010-01-09 3
;

data want;
  set have;
  by id;
  if first.id then index_dt = measure_dt;
  retain index_dt;
  format index_dt yymmdd10.;
  days=measure_dt - index_dt;
  if days > 180 then do;
     index_dt=measure_dt;
  end;
run;

Result

                           measure_
Obs    id    measure_dt       cnt        index_dt    days

 1      1    2008-01-22        1       2008-01-22       0
 2      1    2008-04-21        2       2008-01-22      90
 3      1    2008-08-04        3       2008-08-04     195
 4      1    2008-09-17        4       2008-08-04      44
 5      2    2009-05-02        1       2009-05-02       0
 6      2    2009-11-22        2       2009-11-22     204
 7      2    2010-01-09        3       2009-11-22      48

I don't understand the missing values in your expected output.  Why would you want that?  If you want that you will have to make another variable in addition to the one you use to remember (retain) the index date.

Patrick
Opal | Level 21

Below code returning zero instead of missings and repeating the values for the new index date - but it wouldn't be hard to add a bit of if/then logic to create the missings instead.

data have;
  input id measure_dt :mmddyy. measure_cnt;
  format measure_dt mmddyy10.;
  datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
;
run;

data want;
  set have;
  by id measure_dt;
  attrib days_last_ind label='days since last index date'
         new_ind_date  lable='new index date' format=mmddyy10.
         ;

  retain new_ind_date;

  if first.id then new_ind_date=measure_dt;
  days_last_ind=measure_dt-new_ind_date;
  if days_last_ind>180 then new_ind_date=measure_dt;
run;

proc print data=want;
run;

Patrick_0-1710899539003.png

 

Ksharp
Super User
data have;
input id measure_dt :mmddyy. measure_cnt;
format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
;
run;

data want;
 set have;
 by id;
retain new_index_date ;
if first.id then new_index_date=measure_dt ;
 else days_since_last_index_date=measure_dt-new_index_date;

if days_since_last_index_date>180 then new_index_date=measure_dt ;
format new_index_date mmddyy10.;
run;
mkeintz
PROC Star

For each qualifying observation, generate a _cutoff_date (retained), against which subsequent measure_dt values will be tested:

 

data have;
  input id measure_dt :mmddyy. measure_cnt;
  format measure_dt mmddyy10.;
datalines;
1 1/22/2008 1
1 4/21/2008 2
1 8/4/2008 3
1 9/17/2008 4
2 5/2/2009 1
2 11/22/2009 2
2 1/9/2010 3
run;

data want (drop=_:);
  set have;
  by id;
  retain _cutoff_date;
  if first.id=1 or measure_dt>_cutoff_date then do;
    new_index_date=measure_dt;
    _cutoff_date=measure_dt+180;
  end;
  format new_index_date mmddyy10.;
run;

This replicates the results you showed.  But if you want to keep the established new_index_date for records within 180 dates, just add new_index_date to the RETAIN statement.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 347 views
  • 0 likes
  • 5 in conversation