morris
Calcite | Level 5

Hi all,

 

I am running a survival analysis in PROC PHREG in which I am interested in predicting survival from the interaction of two categorical variables, each with two levels (treatment = control or experimental; size class = small (S) or large (L)). I am using SAS version 9.4.

 

I first ran the code:

proc phreg data=males;
class treatment size_class;
model exposuredays*event(0)=treatment*size_class/ties=exact;
run;

I then decided I wanted to set the reference level for each class variable to be able to make more intuitive comparisons and ran the following code:

 

proc phreg data=males;
class treatment (ref='control') size_class (ref='S');
model exposuredays*event(0)=treatment*size_class/ties=exact;
run;

I thought the p-values, AIC values, etc. would be the same from each run, but they are not even close. I found some lecture notes online which said that without reference coding the model "estimates the difference in the effect of each level compared to the average effect over all four levels" and that with reference coding the model "estimates the difference in the effect of each level compared to the reference level." (http://www.misug.org/uploads/8/1/9/1/8191072/bgillespie_phreg.pdf) However, I don't fully understand 1) what this means, or 2) which specification is most appropriate for my analysis. It seems to me that with two levels for each categorical variable, reference coding would be more appropriate, but as I don't fully understand why there is a difference in the first place, I am not very confident about that. I feel like this must have come up before, but I didn't see it in previous posts. Can anyone help me to understand?

 

Thanks in advance for any suggestions!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
You want to add PARAM=REF as well. When you dummy code a variable, you can use 0/1 to code it (reference coding). But you can also use other coding systems, such as -1, 0, 1 for the different levels (effect coding). In that case you're testing a different hypothesis, so you get different results. Stick with PARAM=REF so you can understand what you get. It's also the most commonly used parameterization method.
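
As a minimal sketch of that suggestion, assuming the data set and variable names from the question (males, exposuredays, event, treatment, size_class), the coding scheme can be stated explicitly in the CLASS statement so that the output does not depend on a default:

```sas
/* Sketch only: the model from the question with the coding scheme spelled out. */
/* PARAM=REF after the slash applies reference (0/1) coding to both variables;  */
/* REF= names the level that each variable's other level is compared against.   */
proc phreg data=males;
   class treatment (ref='control') size_class (ref='S') / param=ref;
   model exposuredays*event(0) = treatment*size_class / ties=exact;
run;

/* Assumed variant, not in the original post: main effects listed alongside */
/* the interaction, so each term has its own parameter.                     */
proc phreg data=males;
   class treatment (ref='control') size_class (ref='S') / param=ref;
   model exposuredays*event(0) = treatment size_class treatment*size_class / ties=exact;
run;
```

One thing worth checking: the MODEL statement in the question contains only the interaction term. With reference coding, a lone interaction between two two-level variables may reduce to a single parameter that contrasts one treatment-by-size cell with the other three combined, so changing the reference levels would change which cell is singled out, and with it the fit statistics. If that is not the comparison you intend, the second step above is one common alternative; treat it as an illustration rather than a recommendation for this particular analysis.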

morris
Calcite | Level 5

Thank you for the quick response!  Apologies for taking so long to reply.  I did not have access to a computer with SAS on it over the weekend.  Much appreciated.
