medic
Fluorite | Level 6
Hi,
I'm using PROC PHREG to analyze survival data.
Since the explanatory variable is time-dependent, I specified the model as:
(start, end)*censor(0)=variable

Because the original code takes too long to run, I also made another dataset in which I manually calculated the time to event (tte = end - start), and ran the following model instead:
tte*censor(0)=variable

But the two models give different results, although I think the only difference is that I calculated the length of "(start, end)" in advance.

Can anyone explain why this is happening?

Thanks a lot in advance!
ballardw
Super User

I suspect that we would need to see actual input data (both versions) and the entire proc phreg code used for both to get enough details.

 

Also, which results differ, and by how much?

medic
Fluorite | Level 6

Thanks.

Here are some examples of my data structure.

 

Say the study period runs until 2013-12-31, as for ID 1.

If a subject's time-dependent variable (treatment) changes during follow-up, as for IDs 2 and 3,

I record the attributes before and after the date of the change, as in Table 1.

 

Table 1.

ID  startdate1  enddate1    treatment1  outcome1  startdate2  enddate2    treatment2  outcome2
1   2013-05-01  2013-12-31  0           0         .           .           .           .
2   2013-06-01  2013-06-30  0           0         2013-06-30  2013-11-20  1           0
3   2013-07-01  2013-07-30  0           0         2013-07-30  2013-12-01  1           1

 

To run this through PROC PHREG, I reshaped the dataset into long form, as in Table 2.

(i.e., subjects whose state changes have 2 rows)

 

Table 2.

ID  start       end         treatment  outcome
1   2013-05-01  2013-12-31  0          0
2   2013-06-01  2013-06-30  0          0
2   2013-06-30  2013-11-20  1          0
3   2013-07-01  2013-07-30  0          0
3   2013-07-30  2013-12-01  0          1
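
The reshaping from Table 1 to Table 2 could be done with a DATA step along these lines. This is only a sketch: the dataset names `wide` and `long`, the column name `treatment2` (the Table 1 header repeats `treatment1`), and the working variables `period` and `stop` are assumptions made for illustration, not names from the original program. Note that Table 1 lists treatment 1 for the second period of ID 3 while Tables 2 and 3 show 0; the sketch simply carries over whatever Table 1 contains.

```sas
/* Table 1 as posted; "." marks a missing second period. */
data wide;
    input id startdate1 :yymmdd10. enddate1 :yymmdd10. treatment1 outcome1
             startdate2 :yymmdd10. enddate2 :yymmdd10. treatment2 outcome2;
    format startdate1 enddate1 startdate2 enddate2 yymmdd10.;
    datalines;
1 2013-05-01 2013-12-31 0 0 . . . .
2 2013-06-01 2013-06-30 0 0 2013-06-30 2013-11-20 1 0
3 2013-07-01 2013-07-30 0 0 2013-07-30 2013-12-01 1 1
;
run;

/* One output row per non-missing period.                          */
/* The interval end is built as STOP and renamed to END on output, */
/* so the step never assigns to a variable named like a keyword.   */
data long(rename=(stop=end));
    set wide;
    array s[2] startdate1 startdate2;
    array e[2] enddate1   enddate2;
    array t[2] treatment1 treatment2;
    array o[2] outcome1   outcome2;
    do period = 1 to 2;
        if not missing(s[period]) then do;
            start     = s[period];
            stop      = e[period];
            treatment = t[period];
            outcome   = o[period];
            output;
        end;
    end;
    format start stop yymmdd10.;
    keep id start stop treatment outcome;
run;

proc print data=long noobs;
run;
```

With this layout, `start` and `end` are SAS date values, so the time axis of a `(start, end)` model is the calendar date itself.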

 

and the code goes as:

 

proc phreg data=sample;
    class treatment (ref='0');
    model (start, end)*outcome(0) = treatment / rl;
    strata OOO;  /* OOO: some other adjusting variables */
run;

 

But this code takes several hours to produce results. (The original dataset has about 10 million records.)

So instead of using the (start, end) form, I manually calculated the time-to-event (tte), as in Table 3.

 

Table 3.

ID  tte  treatment  outcome
1   244  0          0
2   29   0          0
2   143  1          0
3   29   0          0
3   124  0          1

 

and changed only the (start, end) part to "tte":

 

proc phreg data=sample;
    class treatment (ref='0');
    model tte*outcome(0) = treatment / rl;
    strata OOO;  /* OOO: some other adjusting variables */
run;

 

In this case the program takes only minutes, but the results differ from those of the (start, end) model.

(Sorry, I don't have the exact numbers, since I'm running another program right now.)
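
One way to see how the two MODEL statements can differ is to count which rows are at risk when the event occurs. In the counting-process form `(start, end)`, PROC PHREG treats a row as at risk at time t only when start < t <= end. With a single `tte` value, every row is treated as starting at time 0 and is at risk whenever tte >= t. The sketch below applies both rules to the five rows of Table 2. It ignores the STRATA statement, and the dataset name `rows` and the column name `stop` (used here in place of `end`) are assumptions made for this illustration.

```sas
/* Table 2 as posted, plus the manually computed duration from Table 3. */
data rows;
    input id start :yymmdd10. stop :yymmdd10. treatment outcome;
    tte = stop - start;
    format start stop yymmdd10.;
    datalines;
1 2013-05-01 2013-12-31 0 0
2 2013-06-01 2013-06-30 0 0
2 2013-06-30 2013-11-20 1 0
3 2013-07-01 2013-07-30 0 0
3 2013-07-30 2013-12-01 0 1
;
run;

proc sql;
    /* Counting-process rule: row r is at risk at event time t */
    /* when r.start < t <= r.stop (calendar-date time axis).   */
    title "At risk under (start, end)";
    select ev.id   as event_id,
           ev.stop as event_time format=yymmdd10.,
           count(*) as rows_at_risk
    from rows as ev, rows as r
    where ev.outcome = 1
      and r.start < ev.stop
      and ev.stop <= r.stop
    group by ev.id, ev.stop;

    /* Duration rule: each row restarts at time 0, so row r */
    /* is at risk at event time t when r.tte >= t.          */
    title "At risk under tte";
    select ev.id  as event_id,
           ev.tte as event_time,
           count(*) as rows_at_risk
    from rows as ev, rows as r
    where ev.outcome = 1
      and r.tte >= ev.tte
    group by ev.id, ev.tte;
quit;
title;
```

For the single event in this sample (ID 3, on 2013-12-01, at tte = 124), the first query should report 2 rows at risk (ID 1 and the second row of ID 3), while the second should report 3, because the second row of ID 2 (tte = 143) also counts. Different risk sets give different partial likelihoods, so the two models need not produce the same estimates.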

 

* I followed this paper when preparing the dataset:

http://support.sas.com/resources/papers/proceedings12/168-2012.pdf

It doesn't say whether manually calculating the time-to-event gives the same result.

 

I hope this is enough detail for you to help.
