Best code to prepare data for PHREG. Multiple observations per ID data

Reply
Contributor
Posts: 22

Best code to prepare data for PHREG. Multiple observations per ID data

[ Edited ]

Hi,

 

I have a tumor dataset with multiple observations per ID, and I am intending to use PROC PHREG. First I want to build a dataset with one observation per ID. Besides keeping baseline values. each observation should contain several variables coding whether a cancer recurrent or metastasis happened/ didn't happen, and if so on which date the event occurred/didn't occur.

 

 

My data looks like this: You can put all the codes I written immediately into your SAS and try out yourself.

 

Data have;
INFORMAT CaseType $12.;
FORMAT VisitDate DDMMYYP10.;
Input ID Sex $ VisitNr VisitDate CaseType $;
Cards;
1 m 1 14555 Primary
1 m 2 15000 Recurrent
1 m 3 15550 Metastasis
2 f 1 16520 Primary
2 f 2 17000 Recurrent
2 f 2 17060 Recurrent
3 f 1 14800 Primary
3 f 2 14900 Metastasis
4 m 1 17150 Primary
4 m 2 17750 Rec/Meta
;
RUN;

So as you can see case 2 didn't have any metastasis on the third visit, so I need the latest date on which the patient didn't have an metastasis. But I need the earliest date for an event that happened such as the first recurrence in case 2. My goal is to improve my analysis by using all available follow-up time.

My end dataset should look like this

 

Data want;
Input ID Sex $ PriDate Rec RecDate Metastasis MetaDate;
FORMAT ID 2. sex $1. Rec Meta 1. PriDate RecDate MetaDate DDMMYY10P.;
Cards;
1 m 14555 1 15000 1 15550 
2 f 16520 1 17000 0 17060 
3 f 14800 0 14900 1 14900 
4 m 17150 1 177501 17750
;
RUN;

 

So the variables Rec and Metastasis work basically as censoring variables for the time to recurrence, to metastasis and they have the values 0 or 1.

 

Currently I am using the following code to do this.

DATA want;
FORMAT Rec Meta 1. PriDate RecDate MetaDate DDMMYYP10.;
RETAIN Rec0 Rec1 Meta0 Meta1 PriDate RecDate0 RecDate1 MetaDate0 MetaDate1;
SET have;
BY ID;

IF FIRST.ID THEN DO; Rec0=.; Rec1=.; Meta0=.; Meta1=.; RecDate0=.; RecDate1 =.; MetaDate0 =.; MetaDate1=.; END;

IF CaseType ='Primary' THEN PriDate = VisitDate;
ELSE IF CaseType ='Recurrent' THEN DO; Rec1=1; RecDate1 = min(RecDate1, VisitDate); Meta0=1; MetaDate0=max(MetaDate0, VisitDate); END;
ELSE IF CaseType ='Metastasis' THEN DO; Rec0=1; RecDate0 = max(RecDate0, VisitDate); Meta1=1; MetaDate1=min(MetaDate1, VisitDate); END;
ELSE IF CaseType ='Rec/Meta' THEN DO; Rec1=1; RecDate1 = min(RecDate1, VisitDate); Meta1=1; MetaDate1=min(MetaDate1, VisitDate); END;

IF LAST.ID THEN DO;
IF Rec1 = 1 THEN DO Rec = 1; RecDate=RecDate1; END;
ELSE IF Rec1 ne 1 THEN DO; IF Rec0 = 1 THEN DO; Rec = 0; RecDate=RecDate0; END; END;
IF Meta1 = 1 THEN DO Meta = 1; MetaDate=MetaDate1; END;
ELSE IF Meta1 ne 1 THEN DO; IF Meta0 = 1 THEN DO; Meta = 0; MetaDate=MetaDate0; END; END;
END;
IF LAST.ID THEN OUTPUT;
DROP rec0 rec1 meta0 meta1 RecDate1 RecDate0 MetaDate0 MetaDate1 VisitDate VisitNr CaseType;
RUN;

 

Is there a better way to do this? because it's very intensive and can have mistakes easily. It will be also very unpractical if I have too many possible end events.
Also is there a way may be to do the whole analysis including the PROC PHREG faster? Does The ID statement in the PROC PHREG help?

Finally I would appreciate it a lot if you can paste examples or links to advanced survival analysis codes.

Thanks in forward.

Ubai

Contributor
Posts: 22

Re: Best code to prepare data for PHREG. Multiple observations per ID data

Ok. I just though of a little better way, but I would love to hear from you. Here is a better code using the date of the last visit:

 

DATA want2;
FORMAT Rec Meta 1. PriDate LastDate RecDate MetaDate DDMMYYP10.;
RETAIN Rec Meta PriDate RecDate MetaDate;
SET have;
BY ID;
IF FIRST.ID THEN DO; PriDate = VisitDate; Rec=.; Meta=.; RecDate =.; MetaDate=.; END;

IF CaseType ='Recurrent' THEN DO; Rec=1; RecDate = min(RecDate, VisitDate); END;
ELSE IF CaseType ='Metastasis' THEN DO; Meta=1; MetaDate=min(MetaDate, VisitDate); END;
ELSE IF CaseType ='Rec/Meta' THEN DO; Rec=1; RecDate = min(RecDate, VisitDate); Meta=1; MetaDate=min(MetaDate, VisitDate); END;

IF LAST.ID THEN DO;
LastDate= Visitdate;
IF Rec ne 1 THEN DO; Rec=0; RecDate=VisitDate; END;
IF Meta ne 1 THEN DO; Meta=0; MetaDate=VisitDate; END;
END;
IF LAST.ID THEN OUTPUT;
DROP VisitDate VisitNr CaseType;
RUN;
PROC Star
Posts: 127

Re: Best code to prepare data for PHREG. Multiple observations per ID data

Instead of retaining some variables, setting them missing at first.ID, and having an if last.ID code block, I would prefer putting things in a "doW" loop (named after Ian Whitlock, who has done a lot to popularize it). And all the if-then-else code on a single variable fits naturally in a SELECT statement.

Finally, you should take more care about how you format your code. Using line breaks and indents help a lot, and I think the code becomes a lot more readable if you do NOT capitalize all keywords - there are actually scientific studies showing that most people have an easier time reading lowcase text than capitalized text!

So, if I were to take over your code from here, it would very soon look something like this:

data want;
  format Rec Meta 1. PriDate LastDate RecDate MetaDate DDMMYYP10.;;
  do until(last.ID);
    set have;
    by ID;
    select(CaseType);
      when('Primary') PriDate = VisitDate;
      when('Recurrent') do;
        Rec=1;
        RecDate = min(RecDate, VisitDate);
        end;
      when('Metastasis') do;
        Meta=1;
        MetaDate=min(MetaDate, VisitDate);
        end;
      when('Rec/Meta') do;
        Rec=1;
        RecDate = min(RecDate, VisitDate);
        Meta=1;
        MetaDate=min(MetaDate, VisitDate);
        end;
      otherwise;
      end;
    end;
  LastDate= Visitdate;
  if Rec ne 1 then do;
    Rec=0;
    RecDate=VisitDate;
    end;
  if Meta ne 1 then do;
    Meta=0;
    MetaDate=VisitDate;
    end;
  drop VisitDate VisitNr CaseType;
run;
Contributor
Posts: 22

Re: Best code to prepare data for PHREG. Multiple observations per ID data

Hi Lassen and thanks for replying. I will change my code and utilize the SELECT Statement. I never heard of the doW Loop before. I just found some literature on this topic and I will start reading. Thanks for the hint.

You are right with capitalization. I capitalize few stuff to make the beginning and the end of a step clear for me. This still Needs some improvement though. I also utilize indents and breaks in the original code.

Ask a Question
Discussion stats
  • 3 replies
  • 162 views
  • 1 like
  • 2 in conversation