Case
Calcite | Level 5

Hey all.

I hope that someone can help me with a problem in PROC LOGISTIC concerning the PARAM= option in the CLASS statement (SAS version 9.4). My problem is:

 

I have two exposures that I’m investigating in relation to an outcome with 4 categories.

My coding in SAS is as follows:

 

ods graphics on;

proc logistic data = xxxx plots = all plots (maxpoints = none);

           class exposure1 (ref = 'group1') covar1 (ref = 'group2') covar2 (ref = 'group1') / param = ref; /* or param = effect (default)*/

           model outcome (ref = 'last group') = exposure1 exposure2 covar1 covar2 exposure1*exposure2 / link = glogit;

           oddsratio exposure1;

           oddsratio exposure2 / diff = ref;

run;

ods graphics off;

 

 

In my model I include the following:

Exposure1: included in the CLASS and MODEL statements; a categorical variable with 4 categories.

Exposure2: included in the MODEL statement only; a continuous variable (0 to 1875).

Covar = covariates. I include 9 in my analysis, all categorical (I have defined a specific reference group for each covar, mainly the first or the last category).

Outcome is a categorical variable with 4 categories and I therefore define the reference group for the outcome as well. “link = glogit” is therefore included in the model-statement.

 

The reason I’m a bit concerned is that the p-value for the continuous exposure variable (exposure2) is significant when I use param = effect, but becomes non-significant when I use param = ref, even though the param option does not apply to the continuous variable. I can understand that the param option affects the class level information, but why does it also influence the p-value?

 

These p-values are the ones I get in my analysis with my variables:

Param = ref: P-value = 0.7984 for exposure2 (continuous variable).

Param = effect: P-value = 0.0362 for exposure2 (continuous variable).

 

Meanwhile, the p-values for exposure1 and the interaction do not change when I change the param definition:

Param = ref and param = effect: P-value for exposure1 = 0.0605

Param = ref and param = effect: P-value for interaction = 0.0066

 

Attached is a SAS program illustrating the issue I have explained above. The data are constructed and therefore do not correspond to the p-values presented in this post.

 

To investigate my data further in Proc Logistic and to understand this problem better, I have also investigated two continuous exposures and their interaction with param = ref and param = effect, respectively. In this case, there was no difference between the p-values using these two options. Furthermore, I also investigated two categorical exposures and their interaction with param = ref and param = effect, respectively. In this case, the p-value for both of the main effects (exposure1 and exposure2) changed between the two param options, but the p-value for the interaction remained the same.

 

My problem then is:

Why does the p-value change for the continuous variable when I define the param-option in the class statement? And what does this mean (interpretation)? Which one should I choose to use (param = ref or param = effect)? What is the difference between these two?

In the future, how do I choose whether I should use param = ref or param = effect?

Furthermore, I have also read that the estimates in the "Analysis of Maximum Likelihood Estimates" table do not correspond to the odds ratios presented if I choose to use param = effect. But the ORs are similar no matter which param option (ref or effect) I choose. Is it then possible to use the overall joint p-value for a variable if the estimate is non-significant while the ORs are significant? How do I interpret whether the hypothesis I test is significant or not?

 

As the continuous exposure variable is the one I’m interested in understanding better, it is really important for me to be sure that I’m analyzing the data correctly.

 

I hope that someone can help me – thank you.

1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

The models under the two parameterizations are equivalent: they have the same log likelihood (-2 Log L for Intercept and Covariates) and the same predicted values for any setting of the predictors. Because continuous exposure2 is involved in an interaction with the CLASS variable exposure1, its parameter has to change when the coding changes: the parameters for exposure1 and exposure1*exposure2 change with the parameterization (and so does their interpretation, as Rick mentioned), so the exposure2 parameter must change too in order for the predicted values to stay the same. As in any model, when a variable is involved in an interaction with another variable, its effect can only be interpreted at each level of the interacting variable. So the p-value for exposure2 is really not relevant given that it interacts with exposure1. To clarify the reason for the change: under PARAM=REF, the exposure2 parameter is the effect of a unit increase in exposure2 at the reference level of exposure1, whereas under PARAM=EFFECT it is that effect averaged over the levels of exposure1. The effect of exposure2 at the other levels might be different and might or might not be significantly different from zero. Either parameterization is fine; you just have to be sure that any statement of the results is consistent with the interpretation imposed by the parameterization that is used.
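A minimal sketch of how one might check this in practice, assuming simulated data: the data set `sim`, the macro `%fit`, and all coefficient values below are hypothetical and not taken from the original post. The idea is to fit the same model under both codings, compare -2 Log L, and request the exposure2 odds ratio at every level of exposure1 rather than relying on the single exposure2 parameter.

```sas
/* Hypothetical simulated data: names, sample size and coefficients are illustrative only */
data sim;
   call streaminit(2024);
   do id = 1 to 2000;
      g = ceil(4 * rand('uniform'));          /* level index 1-4                      */
      exposure1 = cats('group', g);           /* 4-level CLASS variable               */
      exposure2 = 1875 * rand('uniform');     /* continuous, 0 to 1875                */
      slope = 0.0005 * (g - 1);               /* exposure2 slope differs by level     */
      e1 = exp(-0.3 + slope * exposure2);
      e2 = exp(-0.2 + 0.5 * slope * exposure2);
      e3 = exp(-0.1);
      total = 1 + e1 + e2 + e3;
      /* categories 1-3 get the listed probabilities; 4 gets the remainder */
      outcome = rand('table', e1/total, e2/total, e3/total);
      output;
   end;
   keep id exposure1 exposure2 outcome;
run;

/* Fit the same model under a given CLASS parameterization */
%macro fit(param);
   title "PARAM=&param";
   proc logistic data=sim;
      class exposure1 (ref='group1') / param=&param;
      model outcome (ref='4') = exposure1 exposure2 exposure1*exposure2 / link=glogit;
      units exposure2 = 100;                    /* report ORs per 100 units (assumed scale) */
      oddsratio exposure2 / at(exposure1=all);  /* exposure2 OR at each level of exposure1  */
   run;
   title;
%mend fit;

%fit(ref)
%fit(effect)
```

Under these assumptions, the two runs should show the same -2 Log L and the same exposure2 odds ratios at each level of exposure1, while the exposure2 row in "Analysis of Maximum Likelihood Estimates" differs between them.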

3 REPLIES 3
Rick_SAS
SAS Super FREQ

Which one should I choose to use (param = ref or param = effect)? What is the difference between these two? In the future, how do I choose whether I should use param = ref or param = effect?

 

The various parameterizations are described in the SAS/STAT documentation chapter "Parameterization of Model Effects". I think the important facts are

1. For the EFFECT coding scheme, the parameter estimates of the main effect "estimate the difference in the effect of each nonreference level compared to the average effect over all levels."

2. For the REFERENCE coding scheme, the parameter estimates of the main effect "estimate the difference in the effect of each nonreference level compared to the effect of the reference level."

 

So EFFECT compares each level to the average, whereas REFERENCE compares each (nonreference) level to the reference level. Which one you choose depends on the scientific question you are trying to answer. 
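To make the two coding schemes concrete for the model in the question, here is a short worked sketch. It assumes a single 4-level CLASS variable (exposure1) interacting with one continuous variable (exposure2), looks at one generalized logit with the other covariates held fixed, and uses notation (a_g, b_g) that is introduced here for illustration only.

```latex
% g indexes the 4 levels of exposure1, r is its reference level, x = exposure2.
% a_g and b_g are the level-specific intercept and slope. They are the same
% under both codings, because the fitted model is the same.
\[
  \eta_g(x) = a_g + b_g\,x, \qquad g = 1,\dots,4 .
\]

% PARAM=REF: the exposure2 parameter is the slope at the reference level.
\[
  \beta^{\mathrm{ref}}_{\text{exposure2}} = b_r, \qquad
  \beta^{\mathrm{ref}}_{\text{exposure1}=g\,\times\,\text{exposure2}} = b_g - b_r
  \quad (g \neq r).
\]

% PARAM=EFFECT: the exposure2 parameter is the unweighted average slope.
\[
  \beta^{\mathrm{eff}}_{\text{exposure2}} = \bar b = \tfrac{1}{4}\sum_{g=1}^{4} b_g, \qquad
  \beta^{\mathrm{eff}}_{\text{exposure1}=g\,\times\,\text{exposure2}} = b_g - \bar b
  \quad (g \neq r).
\]

% The odds ratio for a one-unit increase in exposure2 at level g does not
% depend on the coding:
\[
  \mathrm{OR}_g = \exp(b_g).
\]
```

Read this way, the test of the exposure2 parameter addresses H0: b_r = 0 under REF but H0: \bar b = 0 under EFFECT, which are different hypotheses and can give different p-values. The interaction test addresses H0: b_1 = ... = b_4 under both codings, and the exposure1 test addresses H0: a_1 = ... = a_4 under both, which is consistent with those two p-values staying the same in the original post.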

Case
Calcite | Level 5

Thank you so much, StatDave_sas, for your help. I think this solved my problem.

And thank you, Rick_SAS, for your help.

 
