YYK273
Obsidian | Level 7

Hi SAS Community,

I am working on a Cox proportional hazards model, and the test of the proportional hazards assumption indicates violations for five covariates. To address this, I am trying to create time-varying covariates within PROC PHREG in SAS.

 

The issue is with creating these time-varying covariates. Of the five covariates, one is continuous and the remaining four are categorical. When I run my code, the SAS log reports that the four categorical variables are not found; only the continuous variable is created.

 

Here is my code:

proc phreg data=cox_001 covs(aggregate);
CLASS  RArace_time1(ref=first)  RAEDUC4_time1(ref=last) HwATOTBcd_time1(ref=last) Cohort_time1(ref='3.Hrs')
ragender(ref=last)  RWSTROKE(ref=first) RWDIABE(ref=first) RWHEARTE(ref=first) RWHIBPE(ref=first) BMIcd(ref='2.18.5-24.9') 
clustervar/ PARAM=REF ;
MODEL Time*event1(0,2)=BLage_time1  RArace_time1  RAEDUC4_time1 Cohort_time1 HwATOTBcd_time1
ragender RWSTROKE RWDIABE RWHEARTE RWHIBPE BMIcd/RL;

   if Time lt 7 then do; BLage_time1=BLRAGEY;end;   else do;BLage_time1=0;end;
   if Time ge 7 then do; BLage_time2=BLRAGEY;end;   else do;BLage_time2=0;end;
   if Time lt 7 then do; RArace_time1=RArace;end;   else do;RArace_time1=0;end;
   if Time ge 7 then do; RArace_time2=RArace;end;   else do;RArace_time2=0;end;
   if Time lt 7 then do; RAEDUC4_time1=RAEDUC4;end;   else do;RAEDUC4_time1=0;end;
   if Time ge 7 then do; RAEDUC4_time2=RAEDUC4;end;   else do;RAEDUC4_time2=0;end;
   if Time lt 7 then do; Cohort_time1=RACOHBYR;end;   else do;Cohort_time1=0;end;
   if Time ge 7 then do; Cohort_time2=RACOHBYR;end;   else do;Cohort_time2=0;end;
   if Time lt 7 then do; HwATOTBcd_time1=HwATOTBcd;end;   else do;HwATOTBcd_time1=0;end;
   if Time ge 7 then do; HwATOTBcd_time2=HwATOTBcd;end;   else do;HwATOTBcd_time2=0;end;
end;
format RArace_time2 RAracef. RAEDUC4_time2 RAEDUC4f. Cohort_time2 RACOHBYR2f. HwATOTBcd_time2 HwATOTBcd1f.
       RArace_time1 RAracef. RAEDUC4_time1 RAEDUC4f. Cohort_time1 RACOHBYR2f. HwATOTBcd_time1 HwATOTBcd1f.;
id clustervar;  weight rWtcrnh2sd; 
ODS OUTPUT PARAMETERESTIMATES=parest;
run;quit;

SAS log:

39 proc phreg data=cox_001 covs(aggregate);
40 CLASS RArace_time1(ref=first) ragender(ref='2.Female')
ERROR: Variable RARACE_TIME1 not found.
41 RAEDUC4_time1(ref=last) HwATOTBcd_time1(ref=last)
ERROR: Variable RAEDUC4_TIME1 not found.
ERROR: Variable HWATOTBCD_TIME1 not found.
42 RWSTROKE(ref=first) RWDIABE(ref=first) RWHEARTE(ref=first) RWHIBPE(ref=first) BMIcd(ref='2.18.5-24.9')
43 Cohort_time1(ref='3.Hrs') clustervar/ PARAM=REF ;
ERROR: Variable Cohort_time1 not found.
44 MODEL T_CIND1yr*event1(0,2)=BLRAGEY_time1 RArace_time1 ragender RAEDUC4_time1 Cohort_time1
45 HwATOTBcd_time1 RWSTROKE RWDIABE RWHEARTE RWHIBPE BMIcd/RL;
46
47 if T_CIND1yr lt 7 then do; BLRAGEY_time1=BLRAGEY;end; else do;BLRAGEY_time1=0;end;
48 if T_CIND1yr ge 7 then do; BLRAGEY_time2=BLRAGEY;end; else do;BLRAGEY_time2=0;end;
49 if T_CIND1yr lt 7 then do; RArace_time1=RArace;end; else do;RArace_time1=0;end;
50 if T_CIND1yr ge 7 then do; RArace_time2=RArace;end; else do;RArace_time2=0;end;
51 if T_CIND1yr lt 7 then do; RAEDUC4_time1=RAEDUC4;end; else do;RAEDUC4_time1=0;end;
52 if T_CIND1yr ge 7 then do; RAEDUC4_time2=RAEDUC4;end; else do;RAEDUC4_time2=0;end;
53 if T_CIND1yr lt 7 then do; Cohort_time1=RACOHBYR;end; else do;Cohort_time1=0;end;
54 if T_CIND1yr ge 7 then do; Cohort_time2=RACOHBYR;end; else do;Cohort_time2=0;end;
55 if T_CIND1yr lt 7 then do; HwATOTBcd_time1=HwATOTBcd;end; else do;HwATOTBcd_time1=0;end;
56 if T_CIND1yr ge 7 then do; HwATOTBcd_time2=HwATOTBcd;end; else do;HwATOTBcd_time2=0;end;
57 end;
58 format RArace_time2 RAracef. RAEDUC4_time2 RAEDUC4f. Cohort_time2 RACOHBYR2f. HwATOTBcd_time2 HwATOTBcd1f.
59 RArace_time1 RAracef. RAEDUC4_time1 RAEDUC4f. Cohort_time1 RACOHBYR2f. HwATOTBcd_time1 HwATOTBcd1f.;
60 id clustervar; weight rWtcrnh2sd;
61 ODS OUTPUT PARAMETERESTIMATES=parest;
62 run;


NOTE: The SAS System stopped processing this step because of errors.

 

Here are my specific questions:

  1. Does the code look correct for creating time-varying covariates, considering that one is continuous and the rest are categorical?
  2. Is the way I format the created variables correct?
  3. Are there any potential pitfalls or improvements that you would recommend?

Thank you in advance for taking the time to review my code.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
OsoGris
SAS Employee

Please note that when using programming statements to create time-dependent covariates they must appear inside the PROC PHREG step.  You cannot create them in a prior DATA step unless you are using the model (t1, t2) counting-process syntax (instead of programming statements).  Please see the following SAS NOTE:

Usage Note 24554: Why do I get different results when I form time-dependent covariates in a DATA step as opposed to forming them inside a PROC PHREG step?

https://support.sas.com/kb/24/554.html
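
For comparison, a minimal sketch of the counting-process form mentioned above, using only the continuous covariate BLRAGEY and the cut point 7 from the question. The data set cox_split and the variables start, stop, status, age_early, and age_late are hypothetical names, not from the original post. Because each record covers the interval (start, stop], an event at exactly 7 falls in the early period here, whereas `Time lt 7` in programming statements would place it in the late period:

```sas
/* Split each subject's follow-up at t = 7 so every record lies in one period. */
data cox_split;
   set cox_001;
   if Time le 7 then do;
      /* follow-up ends in the early period: one record */
      start = 0; stop = Time; status = event1;
      age_early = BLRAGEY; age_late = 0;
      output;
   end;
   else do;
      /* early piece: censored at the cut point (0 is a censoring value) */
      start = 0; stop = 7; status = 0;
      age_early = BLRAGEY; age_late = 0;
      output;
      /* late piece: carries the subject's actual outcome */
      start = 7; stop = Time; status = event1;
      age_early = 0; age_late = BLRAGEY;
      output;
   end;
run;

/* Counting-process syntax: no programming statements are needed. */
proc phreg data=cox_split;
   model (start, stop)*status(0,2) = age_early age_late / rl;
run;
```

The WEIGHT, ID, and COVS(AGGREGATE) settings from the question are left out to keep the sketch short.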

 

 


8 REPLIES 8
ballardw
Super User

Best practice when asking about any error message is to include the code from the Log along with all the notes, warnings and error message text.

 

At the very least mention which variables are the problem.

 

I'm not sure what you mean in this case by "formatting approach". Can you describe your concerns in more detail?

YYK273
Obsidian | Level 7
I really appreciate your instruction! I added the log, which shows that the 4 categorical variables were not found. Since they are newly created variables, and I used ref= in the CLASS statement, did I format these new variables correctly?
ballardw
Super User

I don't use Phreg much but I think I get a clue from this statement in the online help regarding programming statements:

Programming statements are used to create or modify the values of the explanatory variables in the MODEL statement.

I think in your case the operative part is "create values", not variables. It seems the variable whose value you modify needs to exist before the programming statement assigns the value.

 

So you likely have two options:

Add the variables to the input data set with missing values, using an appropriate statement to create the type of variable needed, and then let the programming statements in the Phreg call assign the values.

Or move the Phreg code into a data step to create the variables and the desired values before calling proc phreg.

I think the second approach would be my choice as then I have the values in a set that can be used for other purposes if needed.

Then if you find you need tweaks to values then do that in different Phreg models.

 


YYK273
Obsidian | Level 7

Your input has been immensely helpful! 

While most resources I've come across focus on creating continuous covariates, my specific need is to generate the other four categorical covariates within a PROC PHREG step. The error arises in the CLASS statement.

I would be immensely grateful if someone could share an example or provide guidance on how to appropriately build categorical covariates within a PROC PHREG step.

 

Thank you once again!

ballardw
Super User

@YYK273 wrote:

Your input has been immensely helpful! 

While most resources I've come across focus on creating continuous covariates, my specific need is to generate the other four categorical covariates within a PROC PHREG step. The error arises in the CLASS statement.

I would be immensely grateful if someone could share an example or provide guidance on how to appropriately build categorical covariates within a PROC PHREG step.

 

Thank you once again!


You must have the variables on the CLASS statement in the data set that is used before calling proc phreg.

They can have missing values but the variables have to be there.

BEFORE your Proc phreg, to add the variables, assuming they are numeric:

data cox_001;
   set cox_001;
   retain  RARACE_TIME1  RAEDUC4_TIME1 HWATOTBCD_TIME1 Cohort_time1 . ;
run;

The RETAIN is basically there to prevent log messages of the "Variable XXX has never been referenced" type.

IF these reference  values like

Cohort_time1(ref='3.Hrs')

are coming from a custom format that you have not shown then you will have to associate the format with the new class variables as well.

YYK273
Obsidian | Level 7

This is the SAS log after retain. 

509 data lifecox_002; set lifecox_001;
510 retain RARACE_TIME1 RAEDUC4_TIME1 HWATOTBCD_TIME1 RACOHBYR_time1 RARACE_TIME2 RAEDUC4_TIME2 HWATOTBCD_TIME2 RACOHBYR_time2 .;
511 run;

NOTE: There were 11195 observations read from the data set LIFECOX_001.
NOTE: The data set LIFECOX_002 has 11195 observations and 108 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


512
513 proc phreg data=lifecox_002 covs(aggregate);
514 CLASS RArace_time1(ref=first) RAEDUC4_time1(ref=last) HwATOTBcd_time1(ref=last) RACOHBYR_time1(ref='3.Hrs')
515 ragender(ref=last) RWSTROKE(ref=first) RWDIABE(ref=first) RWHEARTE(ref=first) RWHIBPE(ref=first) BMIcd(ref='2.18.5-24.9')
516 clustervar/ PARAM=REF ;
517 MODEL T_CIND1yr*event1(0,2)=BLRAGEY_time1 RArace_time1 ragender RAEDUC4_time1 RACOHBYR_time1
518 HwATOTBcd_time1 RWSTROKE RWDIABE RWHEARTE RWHIBPE BMIcd/RL;
519 if T_CIND1yr lt 7 then do;
520 BLRAGEY_time1=BLRAGEY; RArace_time1=RArace; RAEDUC4_time1=RAEDUC4; RACOHBYR_time1=RACOHBYR; HwATOTBcd_time1=HwATOTBcd;end;
521 else do;
522 BLRAGEY_time1=0; RArace_time1=0; RAEDUC4_time1=0; RACOHBYR_time1=0; HwATOTBcd_time1=0;end;
523 format /* RArace_time2 RAracef. RAEDUC4_time2 RAEDUC4f. RACOHBYR_time2 RACOHBYR2f. HwATOTBcd_time2 HwATOTBcd1f.*/
524 RArace_time1 RAracef. RAEDUC4_time1 RAEDUC4f. RACOHBYR_time1 RACOHBYR2f. HwATOTBcd_time1 HwATOTBcd1f.;
525 id clustervar; weight rWtcrnh2sd;
526 run;


ERROR: There are no valid observations.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PHREG used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds

 

YYK273
Obsidian | Level 7
I appreciate your replies! They are very helpful.
I used dummy variables instead, so the new variables no longer need to be in the CLASS statement, which avoids the error.
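
A minimal sketch of that dummy-variable approach, assuming RArace is numeric with codes 1, 2, and 3 and that 1 is the reference level (the actual coding is not shown in the thread). The names cox_dummies, race2, race3, and the _early/_late variables are hypothetical. The time-fixed indicators are built in a DATA step; the time-dependent versions are built by programming statements inside PROC PHREG, so none of them has to appear in the CLASS statement:

```sas
/* Time-fixed 0/1 indicators for the non-reference levels of RArace. */
data cox_dummies;
   set cox_001;
   race2 = (RArace = 2);
   race3 = (RArace = 3);
   /* keep missing race as missing rather than coding it as the reference */
   if missing(RArace) then call missing(race2, race3);
run;

proc phreg data=cox_dummies;
   model Time*event1(0,2) = race2_early race3_early
                            race2_late  race3_late / rl;
   /* Programming statements are re-evaluated at each event time,   */
   /* so (Time lt 7) is 1 in the early period and 0 in the late one. */
   race2_early = race2 * (Time lt 7);
   race3_early = race3 * (Time lt 7);
   race2_late  = race2 * (Time ge 7);
   race3_late  = race3 * (Time ge 7);
run;
```

Including both the _early and _late versions gives each period its own hazard ratio against the reference level. The WEIGHT, ID, and COVS(AGGREGATE) settings from the question are omitted here for brevity.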
