Hi SAS Community,
I am working on a Cox proportional hazards model, and the test of the proportional hazards assumption indicates violations for five covariates. To address this, I am trying to create time-varying covariates within PROC PHREG in SAS.
The problem is with creating these time-varying covariates. Of the five covariates, one is continuous and the other four are categorical. When I run my code, the SAS log reports that the four categorical variables are not found; only the continuous variable is created.
Here is my code:
proc phreg data=cox_001 covs(aggregate);
CLASS RArace_time1(ref=first) RAEDUC4_time1(ref=last) HwATOTBcd_time1(ref=last) Cohort_time1(ref='3.Hrs')
ragender(ref=last) RWSTROKE(ref=first) RWDIABE(ref=first) RWHEARTE(ref=first) RWHIBPE(ref=first) BMIcd(ref='2.18.5-24.9')
clustervar/ PARAM=REF ;
MODEL Time*event1(0,2)=BLage_time1 RArace_time1 RAEDUC4_time1 Cohort_time1 HwATOTBcd_time1
ragender RWSTROKE RWDIABE RWHEARTE RWHIBPE BMIcd/RL;
if Time lt 7 then do; BLage_time1=BLRAGEY;end; else do;BLage_time1=0;end;
if Time ge 7 then do; BLage_time2=BLRAGEY;end; else do;BLage_time2=0;end;
if Time lt 7 then do; RArace_time1=RArace;end; else do;RArace_time1=0;end;
if Time ge 7 then do; RArace_time2=RArace;end; else do;RArace_time2=0;end;
if Time lt 7 then do; RAEDUC4_time1=RAEDUC4;end; else do;RAEDUC4_time1=0;end;
if Time ge 7 then do; RAEDUC4_time2=RAEDUC4;end; else do;RAEDUC4_time2=0;end;
if Time lt 7 then do; Cohort_time1=RACOHBYR;end; else do;Cohort_time1=0;end;
if Time ge 7 then do; Cohort_time2=RACOHBYR;end; else do;Cohort_time2=0;end;
if Time lt 7 then do; HwATOTBcd_time1=HwATOTBcd;end; else do;HwATOTBcd_time1=0;end;
if Time ge 7 then do; HwATOTBcd_time2=HwATOTBcd;end; else do;HwATOTBcd_time2=0;end;
end;
format RArace_time2 RAracef. RAEDUC4_time2 RAEDUC4f. Cohort_time2 RACOHBYR2f. HwATOTBcd_time2 HwATOTBcd1f.
RArace_time1 RAracef. RAEDUC4_time1 RAEDUC4f. Cohort_time1 RACOHBYR2f. HwATOTBcd_time1 HwATOTBcd1f.;
id clustervar; weight rWtcrnh2sd;
ODS OUTPUT PARAMETERESTIMATES=parest;
run;quit;
SAS log:
39 proc phreg data=cox_001 covs(aggregate);
40 CLASS RArace_time1(ref=first) ragender(ref='2.Female')
ERROR: Variable RARACE_TIME1 not found.
41 RAEDUC4_time1(ref=last) HwATOTBcd_time1(ref=last)
ERROR: Variable RAEDUC4_TIME1 not found.
ERROR: Variable HWATOTBCD_TIME1 not found.
42 RWSTROKE(ref=first) RWDIABE(ref=first) RWHEARTE(ref=first) RWHIBPE(ref=first) BMIcd(ref='2.18.5-24.9')
43 Cohort_time1(ref='3.Hrs') clustervar/ PARAM=REF ;
ERROR: Variable Cohort_time1 not found.
44 MODEL T_CIND1yr*event1(0,2)=BLRAGEY_time1 RArace_time1 ragender RAEDUC4_time1 Cohort_time1
45 HwATOTBcd_time1 RWSTROKE RWDIABE RWHEARTE RWHIBPE BMIcd/RL;
46
47 if T_CIND1yr lt 7 then do; BLRAGEY_time1=BLRAGEY;end; else do;BLRAGEY_time1=0;end;
48 if T_CIND1yr ge 7 then do; BLRAGEY_time2=BLRAGEY;end; else do;BLRAGEY_time2=0;end;
49 if T_CIND1yr lt 7 then do; RArace_time1=RArace;end; else do;RArace_time1=0;end;
50 if T_CIND1yr ge 7 then do; RArace_time2=RArace;end; else do;RArace_time2=0;end;
51 if T_CIND1yr lt 7 then do; RAEDUC4_time1=RAEDUC4;end; else do;RAEDUC4_time1=0;end;
52 if T_CIND1yr ge 7 then do; RAEDUC4_time2=RAEDUC4;end; else do;RAEDUC4_time2=0;end;
53 if T_CIND1yr lt 7 then do; Cohort_time1=RACOHBYR;end; else do;Cohort_time1=0;end;
54 if T_CIND1yr ge 7 then do; Cohort_time2=RACOHBYR;end; else do;Cohort_time2=0;end;
55 if T_CIND1yr lt 7 then do; HwATOTBcd_time1=HwATOTBcd;end; else do;HwATOTBcd_time1=0;end;
56 if T_CIND1yr ge 7 then do; HwATOTBcd_time2=HwATOTBcd;end; else do;HwATOTBcd_time2=0;end;
57 end;
58 format RArace_time2 RAracef. RAEDUC4_time2 RAEDUC4f. Cohort_time2 RACOHBYR2f. HwATOTBcd_time2 HwATOTBcd1f.
59 RArace_time1 RAracef. RAEDUC4_time1 RAEDUC4f. Cohort_time1 RACOHBYR2f. HwATOTBcd_time1 HwATOTBcd1f.;
60 id clustervar; weight rWtcrnh2sd;
61 ODS OUTPUT PARAMETERESTIMATES=parest;
62 run;
NOTE: The SAS System stopped processing this step because of errors.
Here are my specific questions:
Thank you in advance for taking the time to review my code.
Please note that when using programming statements to create time-dependent covariates they must appear inside the PROC PHREG step. You cannot create them in a prior DATA step unless you are using the model (t1, t2) counting-process syntax (instead of programming statements). Please see the following SAS NOTE:
https://support.sas.com/kb/24/554.html
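For what it's worth, here is a minimal sketch of the counting-process route mentioned above. It is not taken from the SAS NOTE: the toy data set and the names toy, toy_split, subj, race2, race2_late, tstart, tstop and status are made up for illustration, and the cut point of 7 simply mirrors the code in the question. The idea is to split each subject's follow-up at 7 in a DATA step, so the period-specific covariates already exist in the input data set when PROC PHREG reads it.

```sas
/* Hypothetical toy data: event = 1 is the event, 0 is censored. */
data toy;
   input subj time event age race;
   datalines;
1   3.0 1 62 1
2   9.5 1 70 2
3  12.0 0 58 1
4   6.5 0 66 2
5   8.0 1 73 1
6  10.0 1 61 2
7   4.0 1 68 2
8  11.0 0 64 1
9   5.5 1 59 1
10  7.5 1 71 2
11  2.0 0 65 2
12 13.0 1 60 1
;
run;

/* One row per subject and period; each row covers (tstart, tstop]. */
/* Note: an event at exactly 7 lands in the first interval here.    */
data toy_split;
   set toy;
   race2 = (race = 2);            /* numeric 0/1 indicator, so no CLASS is needed */
   if time le 7 then do;
      tstart = 0; tstop = time; status = event;
      race2_late = 0;
      output;
   end;
   else do;
      /* first period: still at risk at 7, so this row is censored */
      tstart = 0; tstop = 7; status = 0;
      race2_late = 0;
      output;
      /* second period: carries the subject's real outcome */
      tstart = 7; tstop = time; status = event;
      race2_late = race2;
      output;
   end;
run;

proc phreg data=toy_split;
   model (tstart, tstop)*status(0) = age race2 race2_late / rl;
run;
```

In this layout the coefficient of race2 is the log hazard ratio before 7, and race2_late is the change in that log hazard ratio afterwards. The toy data is far too small for meaningful estimates; it is only there to make the step runnable.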
Best practice when asking about any error message is to include the code from the Log along with all the notes, warnings and error message text.
At the very least mention which variables are the problem.
I'm not sure what you mean in this case by "formatting approach". Can you describe your concerns in more detail?
I don't use PROC PHREG much, but I think I get a clue from this statement in the online help regarding programming statements:
Programming statements are used to create or modify the values of the explanatory variables in the MODEL statement.
I think in your case the operative part is "create values", not variables. It seems the variable whose value you modify needs to exist before the programming statement assigns the value.
So you likely have two options:
Add the variables to the input data set with missing values, using an appropriate statement to create the type of variable needed, and then let the programming statements in the PROC PHREG call assign the values.
Or move the code from the PROC PHREG step into a DATA step to create the variables and the desired values before calling PROC PHREG.
I think the second approach would be my choice, because then I have the values in a data set that can be used for other purposes if needed.
If you then find you need to tweak the values, do that in different PROC PHREG models.
Your input has been immensely helpful!
Most resources I've come across focus on creating continuous time-varying covariates, but I specifically need to generate the other four, categorical, covariates within a PROC PHREG step. The error arises in the CLASS statement.
I would be immensely grateful if someone could share an example or provide guidance on how to build categorical time-varying covariates within a PROC PHREG step.
Thank you once again!
You must have the variables on the CLASS statement in the data set that is used before calling proc phreg.
They can have missing values but the variables have to be there.
Before your PROC PHREG step, add the variables like this, assuming they are numeric:
data cox_001; set cox_001; retain RARACE_TIME1 RAEDUC4_TIME1 HWATOTBCD_TIME1 Cohort_time1 . ; run;
The RETAIN statement is basically there to prevent log messages of the "Variable XXX has never been referenced" type.
If reference values like
Cohort_time1(ref='3.Hrs')
are coming from a custom format that you have not shown then you will have to associate the format with the new class variables as well.
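Another route that avoids the CLASS statement for the time-varying pieces altogether is to build numeric 0/1 indicator variables with programming statements. Below is a minimal sketch under stated assumptions: the toy data set and the names toy, subj, race2_early and race2_late are hypothetical, the categorical variable is assumed to be numeric with codes 1 and 2, and the cut point of 7 mirrors the code in the question.

```sas
/* Hypothetical toy data: event = 1 is the event, 0 is censored. */
data toy;
   input subj time event age race;
   datalines;
1   3.0 1 62 1
2   9.5 1 70 2
3  12.0 0 58 1
4   6.5 0 66 2
5   8.0 1 73 1
6  10.0 1 61 2
7   4.0 1 68 2
8  11.0 0 64 1
9   5.5 1 59 1
10  7.5 1 71 2
11  2.0 0 65 2
12 13.0 1 60 1
;
run;

proc phreg data=toy;
   /* The indicators are plain numeric covariates, so none of them is listed in CLASS. */
   model time*event(0) = age race2_early race2_late / rl;

   /* Programming statements are evaluated at each event time; "time" here */
   /* takes the value of the event time that defines the current risk set. */
   race2_early = (race = 2) * (time lt 7);
   race2_late  = (race = 2) * (time ge 7);
run;
```

Here race2_early is the log hazard ratio for level 2 versus level 1 before 7, and race2_late is the same comparison from 7 onward. A variable with more than two levels would need one such pair of indicators for each non-reference level, and the toy data is only there to make the step runnable.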
This is the SAS log after retain.
509 data lifecox_002; set lifecox_001;
510 retain RARACE_TIME1 RAEDUC4_TIME1 HWATOTBCD_TIME1 RACOHBYR_time1 RARACE_TIME2 RAEDUC4_TIME2 HWATOTBCD_TIME2 RACOHBYR_time2 .;
511 run;
NOTE: There were 11195 observations read from the data set LIFECOX_001.
NOTE: The data set LIFECOX_002 has 11195 observations and 108 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
512
513 proc phreg data=lifecox_002 covs(aggregate);
514 CLASS RArace_time1(ref=first) RAEDUC4_time1(ref=last) HwATOTBcd_time1(ref=last) RACOHBYR_time1(ref='3.Hrs')
515 ragender(ref=last) RWSTROKE(ref=first) RWDIABE(ref=first) RWHEARTE(ref=first) RWHIBPE(ref=first) BMIcd(ref='2.18.5-24.9')
516 clustervar/ PARAM=REF ;
517 MODEL T_CIND1yr*event1(0,2)=BLRAGEY_time1 RArace_time1 ragender RAEDUC4_time1 RACOHBYR_time1
518 HwATOTBcd_time1 RWSTROKE RWDIABE RWHEARTE RWHIBPE BMIcd/RL;
519 if T_CIND1yr lt 7 then do;
520 BLRAGEY_time1=BLRAGEY; RArace_time1=RArace; RAEDUC4_time1=RAEDUC4; RACOHBYR_time1=RACOHBYR; HwATOTBcd_time1=HwATOTBcd;end;
521 else do;
522 BLRAGEY_time1=0; RArace_time1=0; RAEDUC4_time1=0; RACOHBYR_time1=0; HwATOTBcd_time1=0;end;
523 format /* RArace_time2 RAracef. RAEDUC4_time2 RAEDUC4f. RACOHBYR_time2 RACOHBYR2f. HwATOTBcd_time2 HwATOTBcd1f.*/
524 RArace_time1 RAracef. RAEDUC4_time1 RAEDUC4f. RACOHBYR_time1 RACOHBYR2f. HwATOTBcd_time1 HwATOTBcd1f.;
525 id clustervar; weight rWtcrnh2sd;
526 run;
ERROR: There are no valid observations.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PHREG used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds