BookmarkSubscribeRSS Feed
Ubai
Quartz | Level 8

Hi,

 

I have a tumor dataset with multiple observations per ID, and I am intending to use PROC PHREG. First I want to build a dataset with one observation per ID. Besides keeping baseline values. each observation should contain several variables coding whether a cancer recurrent or metastasis happened/ didn't happen, and if so on which date the event occurred/didn't occur.

 

 

My data looks like this: You can put all the codes I written immediately into your SAS and try out yourself.

 

Data have;
INFORMAT CaseType $12.;
FORMAT VisitDate DDMMYYP10.;
Input ID Sex $ VisitNr VisitDate CaseType $;
Cards;
1 m 1 14555 Primary
1 m 2 15000 Recurrent
1 m 3 15550 Metastasis
2 f 1 16520 Primary
2 f 2 17000 Recurrent
2 f 2 17060 Recurrent
3 f 1 14800 Primary
3 f 2 14900 Metastasis
4 m 1 17150 Primary
4 m 2 17750 Rec/Meta
;
RUN;

So as you can see case 2 didn't have any metastasis on the third visit, so I need the latest date on which the patient didn't have an metastasis. But I need the earliest date for an event that happened such as the first recurrence in case 2. My goal is to improve my analysis by using all available follow-up time.

My end dataset should look like this

 

Data want;
Input ID Sex $ PriDate Rec RecDate Metastasis MetaDate;
FORMAT ID 2. sex $1. Rec Meta 1. PriDate RecDate MetaDate DDMMYY10P.;
Cards;
1 m 14555 1 15000 1 15550 
2 f 16520 1 17000 0 17060 
3 f 14800 0 14900 1 14900 
4 m 17150 1 177501 17750
;
RUN;

 

So the variables Rec and Metastasis work basically as censoring variables for the time to recurrence, to metastasis and they have the values 0 or 1.

 

Currently I am using the following code to do this.

DATA want;
FORMAT Rec Meta 1. PriDate RecDate MetaDate DDMMYYP10.;
RETAIN Rec0 Rec1 Meta0 Meta1 PriDate RecDate0 RecDate1 MetaDate0 MetaDate1;
SET have;
BY ID;

IF FIRST.ID THEN DO; Rec0=.; Rec1=.; Meta0=.; Meta1=.; RecDate0=.; RecDate1 =.; MetaDate0 =.; MetaDate1=.; END;

IF CaseType ='Primary' THEN PriDate = VisitDate;
ELSE IF CaseType ='Recurrent' THEN DO; Rec1=1; RecDate1 = min(RecDate1, VisitDate); Meta0=1; MetaDate0=max(MetaDate0, VisitDate); END;
ELSE IF CaseType ='Metastasis' THEN DO; Rec0=1; RecDate0 = max(RecDate0, VisitDate); Meta1=1; MetaDate1=min(MetaDate1, VisitDate); END;
ELSE IF CaseType ='Rec/Meta' THEN DO; Rec1=1; RecDate1 = min(RecDate1, VisitDate); Meta1=1; MetaDate1=min(MetaDate1, VisitDate); END;

IF LAST.ID THEN DO;
IF Rec1 = 1 THEN DO Rec = 1; RecDate=RecDate1; END;
ELSE IF Rec1 ne 1 THEN DO; IF Rec0 = 1 THEN DO; Rec = 0; RecDate=RecDate0; END; END;
IF Meta1 = 1 THEN DO Meta = 1; MetaDate=MetaDate1; END;
ELSE IF Meta1 ne 1 THEN DO; IF Meta0 = 1 THEN DO; Meta = 0; MetaDate=MetaDate0; END; END;
END;
IF LAST.ID THEN OUTPUT;
DROP rec0 rec1 meta0 meta1 RecDate1 RecDate0 MetaDate0 MetaDate1 VisitDate VisitNr CaseType;
RUN;

 

Is there a better way to do this? because it's very intensive and can have mistakes easily. It will be also very unpractical if I have too many possible end events.
Also is there a way may be to do the whole analysis including the PROC PHREG faster? Does The ID statement in the PROC PHREG help?

Finally I would appreciate it a lot if you can paste examples or links to advanced survival analysis codes.

Thanks in forward.

Ubai

3 REPLIES 3
Ubai
Quartz | Level 8

Ok. I just though of a little better way, but I would love to hear from you. Here is a better code using the date of the last visit:

 

DATA want2;
FORMAT Rec Meta 1. PriDate LastDate RecDate MetaDate DDMMYYP10.;
RETAIN Rec Meta PriDate RecDate MetaDate;
SET have;
BY ID;
IF FIRST.ID THEN DO; PriDate = VisitDate; Rec=.; Meta=.; RecDate =.; MetaDate=.; END;

IF CaseType ='Recurrent' THEN DO; Rec=1; RecDate = min(RecDate, VisitDate); END;
ELSE IF CaseType ='Metastasis' THEN DO; Meta=1; MetaDate=min(MetaDate, VisitDate); END;
ELSE IF CaseType ='Rec/Meta' THEN DO; Rec=1; RecDate = min(RecDate, VisitDate); Meta=1; MetaDate=min(MetaDate, VisitDate); END;

IF LAST.ID THEN DO;
LastDate= Visitdate;
IF Rec ne 1 THEN DO; Rec=0; RecDate=VisitDate; END;
IF Meta ne 1 THEN DO; Meta=0; MetaDate=VisitDate; END;
END;
IF LAST.ID THEN OUTPUT;
DROP VisitDate VisitNr CaseType;
RUN;
s_lassen
Meteorite | Level 14

Instead of retaining some variables, setting them missing at first.ID, and having an if last.ID code block, I would prefer putting things in a "doW" loop (named after Ian Whitlock, who has done a lot to popularize it). And all the if-then-else code on a single variable fits naturally in a SELECT statement.

Finally, you should take more care about how you format your code. Using line breaks and indents help a lot, and I think the code becomes a lot more readable if you do NOT capitalize all keywords - there are actually scientific studies showing that most people have an easier time reading lowcase text than capitalized text!

So, if I were to take over your code from here, it would very soon look something like this:

data want;
  format Rec Meta 1. PriDate LastDate RecDate MetaDate DDMMYYP10.;;
  do until(last.ID);
    set have;
    by ID;
    select(CaseType);
      when('Primary') PriDate = VisitDate;
      when('Recurrent') do;
        Rec=1;
        RecDate = min(RecDate, VisitDate);
        end;
      when('Metastasis') do;
        Meta=1;
        MetaDate=min(MetaDate, VisitDate);
        end;
      when('Rec/Meta') do;
        Rec=1;
        RecDate = min(RecDate, VisitDate);
        Meta=1;
        MetaDate=min(MetaDate, VisitDate);
        end;
      otherwise;
      end;
    end;
  LastDate= Visitdate;
  if Rec ne 1 then do;
    Rec=0;
    RecDate=VisitDate;
    end;
  if Meta ne 1 then do;
    Meta=0;
    MetaDate=VisitDate;
    end;
  drop VisitDate VisitNr CaseType;
run;
Ubai
Quartz | Level 8

Hi Lassen and thanks for replying. I will change my code and utilize the SELECT Statement. I never heard of the doW Loop before. I just found some literature on this topic and I will start reading. Thanks for the hint.

You are right with capitalization. I capitalize few stuff to make the beginning and the end of a step clear for me. This still Needs some improvement though. I also utilize indents and breaks in the original code.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 533 views
  • 1 like
  • 2 in conversation