Categorical independent variable in proc MDC

01-11-2016 08:37 AM

Dear all,

I am trying to run a random utility model using PROC MDC. The model runs well using a decision variable and a choice variable (2 choices). However, I have an independent variable (categorical with 6 levels) which I would like to include in my model as a fixed effect. I want estimates of the fixed effect in addition to its interaction with the choice variable. This has so far been unsuccessful.

Does anyone have experience with such an issue?

Thanks in advance.

Osama

All Replies

Posted in reply to Osamaalmalik

01-11-2016 08:59 AM

Can you share the code which isn't working? What you are trying to do should not be impossible (unless the dataset is too small).

Steve Denham

Posted in reply to SteveDenham

01-11-2016 09:06 AM

Dear Steve,

See code. It is about the variable center.

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE CENTER / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

Posted in reply to Osamaalmalik

01-11-2016 09:28 AM

Not real sure now about the design. Are all patients seen at all centers? If not, then you have a nested logit, at least from the estimation point of view. If that is the case, then this gets beyond anything I have tried in MDC, although there are several examples in the documentation.

If all of the patients are seen at all centers, then adding the interaction choice*center to the model statement may (and I stress **may**) give what you want.

Steve Denham

Posted in reply to SteveDenham

01-11-2016 09:40 AM

Each center has 200 (different) patients, so patients are certainly nested within center. But I am not sure that applies here, since nested logit is meant for decision trees in which the second-level choices are nested within the levels of the first level, e.g. choosing a center and then, within each center, facing different choices (treatment A and B in center 1, treatment C and D in center 2, etc.).

Center*choice certainly worked; it gives an estimate for the choice per center. However, I want to estimate the center effect itself, in addition to the interaction effect.

In other procedures (in a hypothetical situation), e.g. GENMOD, this is done by: model choice_patient=center choice center*choice.

This approach, however, does not seem to work with MDC.

Any suggestions Steve? Thanks in advance.

Posted in reply to Osamaalmalik

01-11-2016 12:45 PM

Again, I'm not sure, but with a binomial choice, it looks like the "interaction" term gives the probability of choosing the lower value for that center. The probability of choosing the upper value would be 1 minus the probability. What additional value for CENTER could you be looking for? If I understand that, then I might be of more help.

Steve Denham

Posted in reply to SteveDenham

01-11-2016 12:59 PM

Dear Steve,

I am comparing the power of a number of models (including fixed effects logistic regression, random effects logistic regression and a random utility model) using the LRT. All models test whether a treatment effect (= utility, if you assume taking the treatment is a choice) depends on center (interaction effect). I try to "purify" the interaction effect by including an effect (fixed or random) for the center itself, in addition to the interaction term. Otherwise I am worried that differences between centers (unrelated to treatment) might influence the power. Included is the code for the fixed effects logistic regression models to clarify the idea.

Thanks again in advance, your help is much appreciated.

ODS OUTPUT ModelFit=FIXED_MODEL_INTERACTION;

PROC GENMOD DATA = COMPLETE DESCENDING;

CLASS CENTER TREATMENT;

MODEL SUCCESS = CENTER TREATMENT CENTER*TREATMENT / DIST=BINOMIAL LINK=LOGIT TYPE3;

BY RUN;

RUN;

DATA FIXED_MODEL_INTERACTION_FINAL(KEEP=RUN LOG_LIK_INTERACTION);

SET FIXED_MODEL_INTERACTION;

WHERE Criterion = 'Log Likelihood';

RENAME VALUE = LOG_LIK_INTERACTION;

RUN;

ODS OUTPUT ModelFit=FIXED_MODEL_NO_INTERACTION;

PROC GENMOD DATA = COMPLETE DESCENDING;

CLASS CENTER TREATMENT;

MODEL SUCCESS = CENTER TREATMENT / DIST=BINOMIAL LINK=LOGIT TYPE3;

BY RUN;

RUN;

DATA FIXED_MODEL_NO_INTERACTION_FINAL(KEEP=RUN LOG_LIK_NO_INTERACTION);

SET FIXED_MODEL_NO_INTERACTION;

WHERE Criterion = 'Log Likelihood';

RENAME VALUE = LOG_LIK_NO_INTERACTION;

RUN;

DATA FIXED_MODEL_TEST;

MERGE FIXED_MODEL_INTERACTION_FINAL FIXED_MODEL_NO_INTERACTION_FINAL;

LRT_STATISTIC = -2*LOG_LIK_NO_INTERACTION + 2*LOG_LIK_INTERACTION;

IF LRT_STATISTIC < 0 THEN P_VALUE_FIXED=1; ELSE P_VALUE_FIXED=1-PROBCHI(LRT_STATISTIC,&CENTERS-1);

IF P_VALUE_FIXED <= 0.05 THEN SIGNIFICANCE_FIXED_MODEL=1; ELSE SIGNIFICANCE_FIXED_MODEL = 0;

RUN;

Posted in reply to Osamaalmalik

01-11-2016 02:52 PM

Hi Osama,

I will go back some now that I have seen that LRT for PROC GENMOD (although a Type 3 analysis should give the same result).

I believe you have run the following:

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE CENTER CHOICE*CENTER / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

but had some problem with:

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE CENTER / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

Were there any error messages in the log or notes in the output for the latter? If not, couldn't you just compare the AIC values of the two models? Or going slightly farther, use the AIC values to calculate the information loss between the two models, to calculate the relative likelihood of the models.

If there were errors/warnings/notes, what did they say?

Steve Denham
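The relative-likelihood comparison suggested above can be sketched in a short DATA step. This is only an illustration: the two AIC values are made-up placeholders, not results from this thread, and in practice they would be read from the fit summary of each PROC MDC run. The data set and variable names are likewise hypothetical.

```sas
/* Sketch only: the AIC values below are hypothetical placeholders. */
data aic_compare;
  aic_full    = 1502.3;  /* hypothetical AIC of the larger model  */
  aic_reduced = 1510.9;  /* hypothetical AIC of the simpler model */
  aic_min     = min(aic_full, aic_reduced);
  /* relative likelihood of each model versus the AIC-best model:
     exp((AIC_min - AIC_i) / 2), which equals 1 for the best model */
  rel_lik_full    = exp((aic_min - aic_full) / 2);
  rel_lik_reduced = exp((aic_min - aic_reduced) / 2);
run;

proc print data=aic_compare noobs;
run;
```

A relative likelihood close to 1 for the simpler model would suggest that little information is lost by dropping the extra terms.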

Posted in reply to Osamaalmalik

01-11-2016 04:05 PM

Dear Steve,

No. The only codes that worked for me are the following two:

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CHOICE*CENTER / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

Do you think using the LRT after applying the above-mentioned code is equivalent to the LRT from the GENMOD code? Or is it apples and oranges?

If I apply the following code:

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CENTER CHOICE / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

I get the note:

"WARNING: Some explanatory variables are constant for all IDs across all primitive alternatives.

WARNING: The Hessian matrix is singular."

When I apply the following code:

PROC MDC DATA=CHOICE_DEF ;

CLASS CENTER;

MODEL CHOICE_PATIENT = CENTER CHOICE CHOICE*CENTER / TYPE=CLOGIT NCHOICE=2;

ID PATIENT;

RUN;

I get the note:

"WARNING: Some explanatory variables are constant for all IDs across all primitive alternatives.

WARNING: The model contains a RESTRICT or BOUNDS statement. The resulting goodness-of-fit measures may be misleading.

WARNING: The Hessian matrix is singular."

I feel it might not be possible, since all the examples in the SAS documentation involve continuous independent variables.

Your help is much appreciated.

Posted in reply to Osamaalmalik

01-12-2016 10:54 AM

Those warning messages are most likely due to overspecification of the model, and mean that everything, except perhaps point estimates, is probably suspect.

I would say that the LRT for the MDC codes is as close as you will get to the GENMOD test, given that the models are parameterized differently in the two PROCs.

I still would look at the AIC values to see how much of the information is retained by moving to the simpler model, since these all seem to be nested.

Steve Denham

Posted in reply to SteveDenham

01-12-2016 11:20 AM

Dear Steve,

Thanks again for the effort.

I think the warning arises because all patients in center i have the same value for the column center, namely i. PROC MDC requires that an explanatory variable take a different value for each of the choices. The dataset in question does not have the format of a normal utility model, in which each value of the response (e.g. a choice between two commodities) has its own value for the explanatory variable (two different prices). Here every patient has the same value of the explanatory variable center, namely i, for both choices (treatment/no treatment).

I believe this issue is unsolvable. I would like to thank you very much anyway.
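The point about center being constant across both alternatives can be written out with the conditional logit probability. As a sketch, assume the utility of alternative $j$ for patient $i$ is $U_{ij} = x_{ij}'\beta + \gamma_{c(i)} + \varepsilon_{ij}$, where $\gamma_{c(i)}$ is a main effect for the patient's center. The notation is an assumption made for illustration and is not taken from the PROC MDC documentation.

```latex
P(y_i = j)
  = \frac{\exp\!\left(x_{ij}'\beta + \gamma_{c(i)}\right)}
         {\sum_{k=1}^{2} \exp\!\left(x_{ik}'\beta + \gamma_{c(i)}\right)}
  = \frac{\exp\!\left(\gamma_{c(i)}\right)\exp\!\left(x_{ij}'\beta\right)}
         {\exp\!\left(\gamma_{c(i)}\right)\sum_{k=1}^{2} \exp\!\left(x_{ik}'\beta\right)}
  = \frac{\exp\!\left(x_{ij}'\beta\right)}
         {\sum_{k=1}^{2} \exp\!\left(x_{ik}'\beta\right)}
```

Because $\gamma_{c(i)}$ cancels, the likelihood does not depend on it, which is consistent with the "constant for all IDs across all primitive alternatives" and singular Hessian warnings. A term such as CHOICE*CENTER differs between the two alternatives (assuming CHOICE itself does), so it does not cancel.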

Solution

01-12-2016
11:31 AM

Posted in reply to Osamaalmalik

01-12-2016 11:23 AM

Quasicomplete separation--I should have guessed once I saw the NPD warnings for the Hessian matrix. MDC won't work, GENMOD will get nasty (you may need to use EXACT methods), but treating center as a random effect in GLIMMIX ought to be tractable.

Steve Denham
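A minimal sketch of the GLIMMIX route mentioned above, with center as a random intercept. It assumes the data set COMPLETE and the variables SUCCESS, TREATMENT, CENTER and RUN from the GENMOD code earlier in the thread, with SUCCESS coded 0/1 and the data sorted by RUN. The choice of METHOD=QUAD is also an assumption, not something stated in the thread.

```sas
/* Sketch only: assumes COMPLETE is sorted by RUN and SUCCESS is coded 0/1. */
proc glimmix data=complete method=quad;
  by run;
  class center treatment;
  /* DESCENDING models the probability of the higher response level */
  model success(descending) = treatment / dist=binary link=logit solution;
  /* center enters as a random intercept instead of a fixed effect */
  random intercept / subject=center;
run;
```

This covers only the center effect itself. Testing whether the treatment effect varies by center would need an additional random term, which is not shown here.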

Posted in reply to SteveDenham

01-12-2016 11:31 AM

Dear Steve,

I figured as much.

GENMOD is doing well. I also fitted a random effects logistic regression model (center: random effect) using NLMIXED, which is doing OK (non-convergence around 5%).

I guess I now have to decide whether the LRT resulting from the PROC MDC models that can be fitted is theoretically comparable to the LRT from GENMOD and NLMIXED.

Much appreciation.