BookmarkSubscribeRSS Feed
withpeople86
Calcite | Level 5
idyeardeathydeathmdeathd
12010   
12011   
12012   
12013   
12014   
120152015317
22010   
22011   
22012   
22013   
22014   
22015   
22016   
22017   
32010   
32011201131

 

If I have data looking like this, to conduct cox analysis should I make a table looking like below?

Also, what codes should I use to make a table like this?

 

idyeardeathydeathmdeathdstartfinish
12010   0365
12011   366730
12012   7311096
12013   10971462
12014   14631828
120152015317182977.83333
22010   0365
22011   366730
22012   7311096
22013   10971462
22014   14631828
22015   18292194
22016   21952560
22017   25612926
32010   0365
3201120113136661.83333

 

Thanks in advance.

5 REPLIES 5
Astounding
PROC Star

Here are some features about your plan that stand out as questionable.

 

Why should the start for each ID be at 0 and not at 1?  It's inconsistent to go from 0 to 365 for year 1, but 366 to 730 for year 2.

 

Why use fractional days?  SAS is perfectly capable of counting days from January 1 to March 17.

 

Why should the final FINISH value be lower than the final START value?  Should we be adding those together to get the right FINISH?

 

I'm not claiming to have answers to these questions, just pointing out that these issues are red flags.

 

The programming isn't that difficult, but designing the target result takes some work.

 

 

withpeople86
Calcite | Level 5

I am sorry that I made some mistakes and your comment is correct.

I'm attaching the new table here. Is this correct now?

idyeardeathydeathmdeathdstartfinish
12010   1365
12011   366730
12012   7311096
12013   10971462
12014   14631828
12015201531718291907
22010   1365
22011   366730
22012   7311096
22013   10971462
22014   14631828
22015   18292194
22016   21952560
22017   25612926
32010   1365
32011201131366428
Astounding
PROC Star

That looks better.  I can't really advise you about the analysis, but I can give you some direction about creating the START and FINISH variables.  Try it this way:

 

proc sort data=have;
   by id year;
run;

data want;
   set have;
   by id year;
   retain start finish;;
   days_in_current_feb = day(mdy(3, 1, year) - 1);
   if first.id then do;
      start = 1;
      finish = 365 - 28 + days_in_current_feb;
      if deathy > . then finish = mdy(deathm, deathd, deathy) -
                 intnx('year', mdy(1, 1, year), 0) + 1;
   end;
   else do;
      start + 365 - 28 + days_in_current_feb;
      if deathy = . then finish + 365 - 28 + days_in_current_feb;
      else finish + mdy(deathm, deathd, deathy) - start;
   end;
run;

FOR THE MOMENT:  Sorry, this is incomplete.  Once my brain turned to mush, I needed a break.  I will get back to it later to fix the rest of it.

Astounding
PROC Star

OK, here's the retry:

 

proc sort data=have;
   by id year;
run;

data want;
   set have;
   by id year;
   retain start finish;
   days_in_current_year = mdy(1, 1, year+1) - mdy(1, 1, year);
   if first.id then do;
      start = 1;
      finish = days_in_current_year;
   end;
   else do;
      start = finish + 1;
      finish = finish + days_in_current_year;
   end;
   if deathy > . then do;
      finish = finish - days_in_current_year;
      finish = finish + mdy(deathm, deathd, deathy) - mdy(1, 1, year) + 1;
   end;
run;

It looks about right, but test it to see if the final calculation for a death date needs to be adjusted by a day. 

ScottBass
Rhodochrosite | Level 12

Edit:  I posted this without first seeing Astounding's solution.  Our approaches are virtually identical.

 

I thought this would be fun...this is what I came up with:

 

data have;
   length id year deathy deathm deathd 8;
   infile datalines truncover;
   input id year deathy deathm deathd;
   datalines;
1   2010            
1   2011            
1   2012            
1   2013            
1   2014            
1   2015   2015   3   17
2   2010            
2   2011            
2   2012            
2   2013            
2   2014            
2   2015            
2   2016            
2   2017            
3   2010            
3   2011   2011   3   1
;
run;

data want;
   length id year deathy deathm deathd start finish 8;
   infile datalines truncover;
   input id year deathy deathm deathd start finish;
   datalines;
1  2010     .  .  .     1    365
1  2011     .  .  .   366    730
1  2012     .  .  .   731   1096
1  2013     .  .  .  1097   1462
1  2014     .  .  .  1463   1828
1  2015  2015  3  17 1829   1907
2  2010     .  .  .     1    365
2  2011     .  .  .   366    730
2  2012     .  .  .   731   1096
2  2013     .  .  .  1097   1462
2  2014     .  .  .  1463   1828
2  2015     .  .  .  1829   2194
2  2016     .  .  .  2195   2560
2  2017     .  .  .  2561   2926
3  2010     .  .  .     1    365
3  2011  2011  3  1   366    428
;
run;

data test;
   set have;
   by id year;
   * derive date from year ;
   date=mdy(1,1,year);
   * derive number of days in the year ;
   num_days=intck('day',intnx('year',date,0,'B'),intnx('year',date,1,'B'));
   * if death, derive number of days to death ;
   if nmiss(deathy,deathm,deathd)=0 then
   num_days=intck('day',intnx('year',date,0,'B'),mdy(deathm,deathd,deathy));
   * reset if first ID ;
   if first.id then do;
      start=1;
      finish=num_days;
   end;
   else do;
      start=finish+1;
      finish+num_days;  * automatically retained ;
   end;
   drop date;
*   drop num_days;
run;

proc compare base=want comp=test;
run;

However, it doesn't exactly match your desired output.  Perhaps you can run with this and fix the code?  Or perhaps your desired output is slightly off?

 

Hope this helps...

 

P.S.: In the future, can you post your problems with a self-contained data step containing datalines for your data?  This will save us from having to cut-and-paste your data into a working data step.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 740 views
  • 0 likes
  • 3 in conversation