BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ammarhm
Lapis Lazuli | Level 10

Hi everyone,

I have a question regarding changing short to long data frame and splitting dates.

Here is the initial data frame:

 

Patient Initials age Entry Exit Event1 Event2 Weight
K89R AB 34 27-Jul-10 17-Aug-13 1-Mar-08 2-May-11 44
S33T ES 55 23-Jul-09 12-Mar-12 17-Sep-10 . 76

 

What you see here, is that there is an entry and exit dates, with dates for the events 1 and 2, there is also a missing date for event 2 for the second patient because the event didn't happen. Also note that the event1 for the first patient happened before entry date. 

 

 

What I want to get is the following file:

 

Patient Initials age Entry Exit Event1 Event2 Weight
K89R AB 34 27/07/2010 31/12/2010 1 0 44
K89R AB 34 1/01/2011 31/12/2011 1 1 44
K89R AB 34 1/01/2012 31/12/2012 1 1 44
K89R AB 34 1/01/2013 17/08/2013 1 1 44
S33T ES 55 23/07/2009 31/12/2009 0 0 76
S33T ES 55 1/01/2010 31/12/2010 1 0 76
S33T ES 55 1/01/2011 31/12/2011 1 0 76
S33T ES 55 1/01/2012 12/03/2012 1 0 76

 

What you notice here is that the entry to exit date period is split into individual rows per patient, each representing a year. The event columns are now coded as 0 (meaning the event has not yet happened) or 1 (the event happened) which is then carried over to the years after because the event has already happened.

The age increases in every row per patient as time progresses

The patient ID and initial remain the same as well as the weight.

Could anyone please help with this, thank you

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Solved with a do until and using the intnx() function:

data have;
input patient :$4. entry :yymmdd10. exit :yymmdd10. event1 :yymmdd10. event2 :yymmdd10.;
format entry exit event1 event2 yymmddd10.;
datalines;
K98R 2010-07-27 2013-08-17 2008-03-01 2011-05-02
S33T 2009-07-23 2012-03-12 2010-09-17 .
;

data want;
set have (rename=(exit=_exit event1=_event1 event2=_event2));
format exit yymmddd10.;
exit = min(intnx('year',entry,0,'e'),_exit);
do until (exit = _exit);
  if _event1 ne . and _event1 <= exit then event1 = 1; else event1 = 0;
  if _event2 ne . and _event2 <= exit then event2 = 1; else event2 = 0;
  output;
  entry = intnx('year',entry,1,'b');
  exit = min(intnx('year',exit,1,'e'),_exit);
end;
output;
keep patient entry exit event1 event2;
run;

proc print data=want noobs;
run;

Result:

patient         entry          exit    event1    event2

 K98R      2010-07-27    2010-12-31       1         0  
 K98R      2011-01-01    2011-12-31       1         1  
 K98R      2012-01-01    2012-12-31       1         1  
 K98R      2013-01-01    2013-08-17       1         1  
 S33T      2009-07-23    2009-12-31       0         0  
 S33T      2010-01-01    2010-12-31       1         0  
 S33T      2011-01-01    2011-12-31       1         0  
 S33T      2012-01-01    2012-03-12       1         0  

View solution in original post

3 REPLIES 3
Kurt_Bremser
Super User

Solved with a do until and using the intnx() function:

data have;
input patient :$4. entry :yymmdd10. exit :yymmdd10. event1 :yymmdd10. event2 :yymmdd10.;
format entry exit event1 event2 yymmddd10.;
datalines;
K98R 2010-07-27 2013-08-17 2008-03-01 2011-05-02
S33T 2009-07-23 2012-03-12 2010-09-17 .
;

data want;
set have (rename=(exit=_exit event1=_event1 event2=_event2));
format exit yymmddd10.;
exit = min(intnx('year',entry,0,'e'),_exit);
do until (exit = _exit);
  if _event1 ne . and _event1 <= exit then event1 = 1; else event1 = 0;
  if _event2 ne . and _event2 <= exit then event2 = 1; else event2 = 0;
  output;
  entry = intnx('year',entry,1,'b');
  exit = min(intnx('year',exit,1,'e'),_exit);
end;
output;
keep patient entry exit event1 event2;
run;

proc print data=want noobs;
run;

Result:

patient         entry          exit    event1    event2

 K98R      2010-07-27    2010-12-31       1         0  
 K98R      2011-01-01    2011-12-31       1         1  
 K98R      2012-01-01    2012-12-31       1         1  
 K98R      2013-01-01    2013-08-17       1         1  
 S33T      2009-07-23    2009-12-31       0         0  
 S33T      2010-01-01    2010-12-31       1         0  
 S33T      2011-01-01    2011-12-31       1         0  
 S33T      2012-01-01    2012-03-12       1         0  
Ksharp
Super User
data have;
input patient :$4. entry :yymmdd10. exit :yymmdd10. event1 :yymmdd10. event2 :yymmdd10.;
format entry exit event1 event2 yymmdd10.;
datalines;
K98R 2010-07-27 2013-08-17 2008-03-01 2011-05-02
S33T 2009-07-23 2012-03-12 2010-09-17 .
;
data temp;
 set have;
 id+1;
 do date=entry to exit;
  year=year(date);
  flag1=(date>=event1 and not missing(event1));
  flag2=(date>=event2 and not missing(event2));
  output;
 end;
run;
proc sql;
create table want as
 select id,patient,min(date) as entry format=yymmdd10.,
 max(date) as exit format=yymmdd10.,
 max(flag1) as event1,
 max(flag2) as event2
  from temp
   group by id,patient,year;
quit;
ammarhm
Lapis Lazuli | Level 10

Thank you @Ksharp  and @Kurt_Bremser 

Both solutions work like a charm.

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 567 views
  • 1 like
  • 3 in conversation