BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
qbui
Calcite | Level 5

Hi all,

 

I have the data like this:

 

data have;
infile datalines delimiter=',';
attrib
id length=$1
visit_date format=date9. informat=date9.
death_date format=date9. informat=date9.
;
input id $ visit_date death death_date disease;
*format id best. visit_date death_date date9.;
datalines;
1,24oct2012,0,.,0
1,24oct2013,0,.,0
1,23oct2014,0,.,0
1,26oct2015,0,.,0
1,23oct2016,0,.,1
1,28oct2017,0,.,1
2,20dec2007,1,15jun2018,0
2,17sep2008,1,15jun2018,0
2,20dec2009,1,15jun2018,0
2,16sep2010,1,15jun2018,1
2,20dec2011,1,15jun2018,1
2,22sep2012,1,15jun2018,1
;
run;

 

From this longitudinal dataset, I want to generate the dataset to run Cox model with competing risk. More specifically, I want to create the dataset that satisfy the following:

- if from the first visit to the last visit, subject does not have the event (disease=0), then time_to_event= duration from the first visit to the last visit and create variable called censor_status=0.

- if from the first visit to the last visit, subject has the event in between (diease=1), then time_to_event =duration from first visit to the visit_date when even happen (aka disease from 0 to 1) and censor_status=1.

- if death_date occurs before the event happens then time_to_event = duration from first visit to death date, and censor_status=2.

 

Any insight is greatly appreciated.

 

Thanks

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

You have 3 conditions specified, but only provide two IDs. We can't be sure we have programmed all three conditions because we don't have data for the third condition. In the future, we need data that represents all reasonable situations that might happen. However, you can test it out on other conditions to see if it works.

 

Also, I did not compute the variable censor_status and I leave it as a homework problem for you to do so, given the logic of the program, I don't think it will be hard.

 

data want;
    set have;
    by id;
    retain start_date end_date;
    if first.id then do;
        start_date=visit_date;
        end_date=.;
    end;
    if disease=1 and missing(end_date) then end_date=visit_date;
    if last.id and missing(end_date) then end_date=visit_date;
    if last.id and not missing(death_date) then end_date=death_date;
    if last.id then do;
        time_to_event=end_date-start_date;
        output;
    end;
run;
--
Paige Miller

View solution in original post

3 REPLIES 3
Kurt_Bremser
Super User

Before you proceed, you should first learn the proper use of informats in INPUT statements.

Right now, your data step does not work:

 69         data have;
 70         input id visit_date death_date status;
 71         cards;
 
 NOTE: Invalid data for visit_date in line 72 3-11.
 REGEL:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                     
 72         1 01jan2000 . 0
 id=1 visit_date=. death_date=. status=0 _ERROR_=1 _N_=1

(multiple NOTEs for invalid data removed).

Since you use List INPUT , you have to use the colon modifier when using informats. From the look of your data, you should use the DATE9. informat.

Please fix your data step, so we can proceed in helping you further.

qbui
Calcite | Level 5
Hi Kurt_Bremser, Thank you for your reply. I just corrected the data step. I hope it works now. Your opinion is greatly appreciated.
Thanks
PaigeMiller
Diamond | Level 26

You have 3 conditions specified, but only provide two IDs. We can't be sure we have programmed all three conditions because we don't have data for the third condition. In the future, we need data that represents all reasonable situations that might happen. However, you can test it out on other conditions to see if it works.

 

Also, I did not compute the variable censor_status and I leave it as a homework problem for you to do so, given the logic of the program, I don't think it will be hard.

 

data want;
    set have;
    by id;
    retain start_date end_date;
    if first.id then do;
        start_date=visit_date;
        end_date=.;
    end;
    if disease=1 and missing(end_date) then end_date=visit_date;
    if last.id and missing(end_date) then end_date=visit_date;
    if last.id and not missing(death_date) then end_date=death_date;
    if last.id then do;
        time_to_event=end_date-start_date;
        output;
    end;
run;
--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 488 views
  • 2 likes
  • 3 in conversation