DATA Step, Macro, Functions and more

Obtain observation Date where an indicator is first observed

Reply
Regular Learner
Posts: 1

Obtain observation Date where an indicator is first observed

Hello,

 

I am trying to get for each ID, the date at which a 'status' column first becomes either '9','F', or 'R'.

For the example below, i would want ID 1 to show '12/31/2007' and ID 2 '11/30/2007'

 

statIDdateDeliq Hist
C17/31/2007CCCCCCCCCCCC
C18/31/2007CCCCCCCCCCCC
C19/30/2007CCCCCCCCCCCC
3110/31/2007CCCCCCCCCCC3
6111/30/2007CCCCCCCCCC36
9112/31/2007CCCCCCCCC369
F11/31/2008CCCCCCCC369F
F12/29/2008CCCCCCC369FF
C13/31/2008CCCCCC369FFC
C14/30/2008CCCCC369FFCC
C15/31/2008CCCC369FFCCC
C16/30/2008CCC369FFCCCC
C17/31/2008CC369FFCCCCC
C18/31/2008C369FFCCCCCC
C19/30/2008369FFCCCCCCC
C27/31/2007CCCCCCCCCCCC
C28/31/2007CCCCCCCCCCCC
C29/30/2007CCCCCCCCCCCC
C210/31/2007CCCCCCCCCCCC
F211/30/2007CCCCCCCCCCCF
9212/31/2007CCCCCCCCCCF9
Respected Advisor
Posts: 3,156

Re: Obtain observation Date where an indicator is first observed

The previous one is buggy, 

data want;
set have;
by id stat notsorted;
if first.id then n=0;
if first.stat and stat in ('9', 'F', 'R') and id=lag(id) then n+1;
if n=1;
drop n;
run;
Respected Advisor
Posts: 4,919

Re: Obtain observation Date where an indicator is first observed

Simple to do with SQL, taking advantage of auto-remerging

 

proc sql;
select * from have
where stat in ("9","F","R")
group by ID
having date = min(date);
quit;
PG
Respected Advisor
Posts: 4,919

Re: Obtain observation Date where an indicator is first observed

Or if you want to get also IDs that never reached 9 F or R status in your output:

 

proc sql;
select * from 
(select distinct ID from have) 
natural left join
(select * from have
where stat in ("9","F","R")
group by ID
having date = min(date));
quit;
PG
Super User
Posts: 5,497

Re: Obtain observation Date where an indicator is first observed

A handy DATA step technique is to go through the data twice:

 

data want;

   length first_date $ 10;

   do until (last.ID);

      set have;

      by id;

      if first_date=' ' and stat in ('9', 'F', 'R') then first_date=date;

   end;

   do until (last.ID);

      set have;

      by id;

      output;

   end;

run;

 

This version assumes DATE is character, but if it's numeric just change the LENGTH statement (and the check for first_date=' ').

 

Good luck.

Ask a Question
Discussion stats
  • 4 replies
  • 200 views
  • 0 likes
  • 4 in conversation