Osamaalmalik
Fluorite | Level 6

Dear all,

I am trying to fit a random utility model using PROC MDC. The model runs well with a decision variable and a choice variable (2 choices). However, I have an independent variable (categorical, with 6 levels) that I would like to include in the model as a fixed effect. I want estimates of this fixed effect in addition to its interaction with the choice variable. So far this has been unsuccessful.

Does anyone have experience with such an issue?

Thanks in advance.

Osama

1 ACCEPTED SOLUTION

SteveDenham
Jade | Level 19

Quasicomplete separation--I should have guessed once I saw the NPD warnings for the Hessian matrix.  MDC won't work, GENMOD will get nasty (you may need to use EXACT methods), but treating center as a random effect in GLIMMIX ought to be tractable.

 

Steve Denham

SteveDenham
Jade | Level 19

Can you share the code which isn't working?  What you are trying to do should not be impossible (unless the dataset is too small).  

 

Steve Denham

Osamaalmalik
Fluorite | Level 6

Dear Steve,

See the code below. The variable in question is CENTER.

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE CENTER/ TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

SteveDenham
Jade | Level 19

Not real sure now about the design.  Are all patients seen at all centers?  If not, then you have a nested logit, at least from the estimation point of view.  If that is the case, then this gets beyond anything I have tried in MDC, although there are several examples in the documentation.

 

If all of the patients are seen at all centers, then adding the interaction choice*center to the MODEL statement may (and I stress may) give what you want.

 

Steve Denham

Osamaalmalik
Fluorite | Level 6

Each center has 200 (different) patients, so patient is certainly nested within center. But I am not sure that applies here, since nested logit is meant for decision trees in which the choices at the second level are nested within the levels of the first level, e.g. choosing a center and then, within each center, choosing among different options (treatment A and B in center 1, treatment C and D in center 2, etc.).

Center*choice certainly worked; it gives an estimate for the choice per center. However, I want to estimate the center effect itself, in addition to the interaction effect.

In other procedures (in a hypothetical situation), e.g. GENMOD, this is done with: model choice_patient = center choice center*choice.

This approach, however, does not seem to work with MDC.

Any suggestions Steve? Thanks in advance.

 

 

SteveDenham
Jade | Level 19

Again, I'm not sure, but with a binary choice, it looks like the "interaction" term gives the probability of choosing the lower value for that center.  The probability of choosing the upper value would be 1 minus that probability.  What additional value for CENTER could you be looking for?  If I understand that, then I might be of more help.

 

Steve Denham

Osamaalmalik
Fluorite | Level 6

Dear Steve,

I am comparing the power of a number of models (including fixed effects logistic regression, random effects logistic regression and a random utility model) using the LRT. All models test whether a treatment effect (= utility, if you regard taking the treatment as a choice) depends on center (interaction effect). I try to "purify" the interaction effect by including an effect (fixed or random) for the center itself, in addition to the interaction term. Otherwise I am worried that differences between centers (unrelated to treatment) might influence the power. The code for the fixed effects logistic regression models is included below to clarify the idea.

Thanks again in advance, your help is much appreciated.

 

ODS OUTPUT ModelFit=FIXED_MODEL_INTERACTION;

PROC GENMOD DATA = COMPLETE DESCENDING;

CLASS CENTER TREATMENT;

MODEL SUCCESS = CENTER TREATMENT CENTER*TREATMENT / DIST=BINOMIAL LINK=LOGIT TYPE3;

BY RUN;

RUN;

DATA FIXED_MODEL_INTERACTION_FINAL(KEEP=RUN LOG_LIK_INTERACTION);

SET FIXED_MODEL_INTERACTION;

WHERE Criterion = 'Log Likelihood';

RENAME VALUE = LOG_LIK_INTERACTION;

RUN;

ODS OUTPUT ModelFit=FIXED_MODEL_NO_INTERACTION;

PROC GENMOD DATA = COMPLETE DESCENDING;

CLASS CENTER TREATMENT;

MODEL SUCCESS = CENTER TREATMENT / DIST=BINOMIAL LINK=LOGIT TYPE3;

BY RUN;

RUN;

DATA FIXED_MODEL_NO_INTERACTION_FINAL(KEEP=RUN LOG_LIK_NO_INTERACTION);

SET FIXED_MODEL_NO_INTERACTION;

WHERE Criterion = 'Log Likelihood';

RENAME VALUE = LOG_LIK_NO_INTERACTION;

RUN;

DATA FIXED_MODEL_TEST;

/* Match the two fits by simulation run rather than by row position */
MERGE FIXED_MODEL_INTERACTION_FINAL FIXED_MODEL_NO_INTERACTION_FINAL;

BY RUN;

/* LRT = 2*(logL full - logL reduced); df = centers-1 with two treatment levels */
LRT_STATISTIC = -2*LOG_LIK_NO_INTERACTION+2*LOG_LIK_INTERACTION;

IF LRT_STATISTIC < 0 THEN P_VALUE_FIXED=1; ELSE P_VALUE_FIXED=1-PROBCHI(LRT_STATISTIC,&CENTERS-1);

IF P_VALUE_FIXED <= 0.05 THEN SIGNIFICANCE_FIXED_MODEL=1; ELSE SIGNIFICANCE_FIXED_MODEL = 0;

RUN;
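Since the goal is to compare power across models, the per-run significance flags can be averaged to get the empirical rejection rate. This is only a minimal sketch: DEMO_TEST is a hypothetical stand-in for FIXED_MODEL_TEST (one row per simulated RUN), and POWER_FIXED, POWER_FIXED_MODEL and N_RUNS are made-up names.

```sas
/* Hypothetical stand-in for FIXED_MODEL_TEST: one row per simulated run */
data demo_test;
   input run significance_fixed_model;
   datalines;
1 1
2 0
3 1
4 1
;
run;

/* The mean of a 0/1 flag is the proportion of runs with p <= 0.05 */
proc means data=demo_test noprint;
   var significance_fixed_model;
   output out=power_fixed mean=power_fixed_model n=n_runs;
run;

proc print data=power_fixed noobs;
   var n_runs power_fixed_model;
run;
```

When the data are simulated with a true center-by-treatment interaction, this proportion estimates power; when they are simulated without one, it estimates the type I error rate.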

 

SteveDenham
Jade | Level 19

Hi Osama,

 

Let me backtrack a bit now that I have seen the LRT code for PROC GENMOD (although a Type 3 analysis should give the same result).

 

I believe you have run the following:

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE CENTER CHOICE*CENTER/ TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

but had some problem with:

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE CENTER/ TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

Were there any error messages in the log or notes in the output for the latter?  If not, couldn't you just compare the AIC values of the two models?  Or going slightly farther, use the AIC values to calculate the information loss between the two models, to calculate the relative likelihood of the models.

 

If there were errors/warnings/notes, what did they say?
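The AIC comparison suggested above can be sketched in a short DATA step. The AIC values here are hypothetical placeholders, not results from this thread; substitute the values reported for each fitted model. The relative likelihood of a model versus the AIC-best model is taken as exp((AIC_min - AIC_i)/2).

```sas
/* AIC values are hypothetical placeholders; substitute the values
   reported in the fit summary of each model */
data aic_compare;
   aic_full    = 1650.2;  /* hypothetical: model with the interaction    */
   aic_reduced = 1655.8;  /* hypothetical: model without the interaction */
   aic_min     = min(aic_full, aic_reduced);
   /* relative likelihood of each model versus the AIC-best model */
   rel_lik_full    = exp((aic_min - aic_full)    / 2);
   rel_lik_reduced = exp((aic_min - aic_reduced) / 2);
run;

proc print data=aic_compare noobs;
run;
```

With these made-up numbers the reduced model comes out at roughly 0.06 times as probable as the full model in terms of minimizing information loss.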

 

Steve Denham

Osamaalmalik
Fluorite | Level 6

Dear Steve,

No. The only two models that ran without problems for me are the following:

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE / TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE*CENTER / TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

Do you think the LRT based on the two models above is equivalent to the LRT from the GENMOD code? Or is it apples and oranges?

 

* If I apply the following code:

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CENTER CHOICE / TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

 

I get the warnings:

"WARNING: Some explanatory variables are constant for all IDs across all primitive alternatives.

WARNING: The Hessian matrix is singular."

 

* When I apply the following code:

 

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CENTER CHOICE CHOICE*CENTER / TYPE=CLOGIT NCHOICE=2 ;

ID PATIENT;

RUN;

 

I get the warnings:

"WARNING: Some explanatory variables are constant for all IDs across all primitive alternatives.

WARNING: The model contains a RESTRICT or BOUNDS statement. The resulting goodness-of-fit measures may be misleading.

WARNING: The Hessian matrix is singular."

 

I do feel it might not be possible, since all the examples in the SAS documentation involve continuous independent variables.

Your help is much appreciated.

 

 

SteveDenham
Jade | Level 19

Those warning messages are most likely due to overspecification of the model, and mean that everything, except perhaps point estimates, is probably suspect.

 

I would say that the LRT for the MDC codes is as close as you will get to the GENMOD test, given that the models are parameterized differently in the two PROCs.
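An LRT between the two MDC models that did run (CHOICE alone versus CHOICE*CENTER) can be computed by hand from their log likelihoods. This is a rough sketch under stated assumptions: the log-likelihood values are hypothetical placeholders, the macro variable CENTERS is set to 6 to match the six center levels mentioned in the thread, and CHOICE*CENTER is assumed to yield one coefficient per center, so the difference in parameters is CENTERS-1.

```sas
/* Log-likelihood values are hypothetical placeholders; substitute the
   values reported by PROC MDC for each fitted model */
%let centers = 6;                 /* six center levels, as in the thread */

data mdc_lrt;
   loglik_reduced = -824.9;       /* hypothetical: MODEL ... = CHOICE        */
   loglik_full    = -819.1;       /* hypothetical: MODEL ... = CHOICE*CENTER */
   lrt_statistic  = 2 * (loglik_full - loglik_reduced);
   df             = &centers - 1; /* assumed: 6 per-center coefficients vs 1 common one */
   p_value        = 1 - probchi(lrt_statistic, df);
run;

proc print data=mdc_lrt noobs;
run;
```

The p-value calculation mirrors the GENMOD-based DATA step earlier in the thread, so the two tests can at least be set side by side on the same scale.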

 

I still would look at the AIC values to see how much of the information is retained by moving to the simpler model, since these all seem to be nested.

 

Steve Denham

 

 

Osamaalmalik
Fluorite | Level 6

Dear Steve,

Thanks again for the effort.

I think the warning arises because all patients in center i have the same value for the column CENTER, namely i. PROC MDC's problem here is that an explanatory variable has to take a different value for each of the choices within a patient. The dataset in question does not have the format of a typical utility model, in which each alternative (e.g. a choice between two commodities) has its own value for the explanatory variable (e.g. 2 different prices). Here every patient has the same value of the explanatory variable CENTER, namely i, for both choices (treatment/no treatment).
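The point about CENTER being the same for both alternatives can be written out for the conditional logit. The notation is not from the thread and is only illustrative: $x_{ij}$ is the alternative-specific covariate for patient $i$ and alternative $j$, and $\gamma_{c(i)}$ is a main effect for the patient's center, identical across that patient's two alternatives.

```latex
P_{ij}
  = \frac{\exp\!\left(x_{ij}\beta + \gamma_{c(i)}\right)}
         {\sum_{k=1}^{2} \exp\!\left(x_{ik}\beta + \gamma_{c(i)}\right)}
  = \frac{\exp\!\left(\gamma_{c(i)}\right)\,\exp\!\left(x_{ij}\beta\right)}
         {\exp\!\left(\gamma_{c(i)}\right)\,\sum_{k=1}^{2} \exp\!\left(x_{ik}\beta\right)}
  = \frac{\exp\!\left(x_{ij}\beta\right)}
         {\sum_{k=1}^{2} \exp\!\left(x_{ik}\beta\right)}
```

Because $\gamma_{c(i)}$ cancels, the likelihood does not depend on it, which would be consistent with the "constant for all IDs across all primitive alternatives" warning and the singular Hessian. A term such as CHOICE*CENTER differs between the two alternatives, so it does not cancel in the same way.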

I believe this issue is unsolvable. I would like to thank you very much anyway.

SteveDenham
Jade | Level 19

Quasicomplete separation--I should have guessed once I saw the NPD warnings for the Hessian matrix.  MDC won't work, GENMOD will get nasty (you may need to use EXACT methods), but treating center as a random effect in GLIMMIX ought to be tractable.
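A random-intercept logistic model with center as a random effect might look roughly like the sketch below. The variable names SUCCESS, CENTER and TREATMENT follow the GENMOD code posted earlier. The simulated DEMO_COMPLETE data set, the seed and all effect sizes are hypothetical and are there only to make the example runnable.

```sas
/* Simulated stand-in data: 6 centers x 200 patients (sizes from the thread);
   the seed and effect sizes are hypothetical */
data demo_complete;
   call streaminit(2024);
   do center = 1 to 6;
      u = rand('normal', 0, 0.5);            /* center-level random intercept */
      do patient = 1 to 200;
         treatment = rand('bernoulli', 0.5);
         eta       = -0.5 + 0.8*treatment + u;
         success   = rand('bernoulli', 1 / (1 + exp(-eta)));
         output;
      end;
   end;
   drop u eta;
run;

/* Random-intercept logistic model: center enters as a random effect */
proc glimmix data=demo_complete method=quad;
   class center treatment;
   model success(event='1') = treatment / dist=binary link=logit solution;
   random intercept / subject=center;
run;
```

In a simulation study, a BY RUN statement could be added in the same way as in the GENMOD steps, assuming the data are sorted by RUN.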

 

Steve Denham

Osamaalmalik
Fluorite | Level 6

Dear Steve,

I figured as much.

GENMOD is doing well. I also fitted a random effects logistic regression model (center as a random effect) using NLMIXED, which is doing OK (non-convergence in around 5% of runs).

I guess I now have to decide whether the LRT from the PROC MDC models that I can actually fit is theoretically comparable to the LRT from GENMOD and NLMIXED.

Much appreciation.
