GLIMMIX for multilevel multinomial logistic regres...

12-08-2016 11:14 AM

Dear all,

I'm a student and I want to **model migration from individual-level data.** Because my data cover many municipalities, I want to run a multilevel analysis with only the intercept as a random effect. The variable I want to predict is multinomial (not ordinal) and has 3 categories:

0 : no migration (reference)

1 : short migration (less than 40km)

2 : long migration (more than 40km)

So I'm trying to use PROC GLIMMIX, but all the options are confusing and I didn't find an example for multinomial data.

**Could you help me choose the right syntax?**

For example, for the empty model I use this syntax:

proc glimmix data=Mob_06.Datas method=LAPLACE NOCLPRINT;
  class DCRAN Migration;
  model Migration (ref=first) = / CL link=glogit dist=MULTINOMIAL solution;
  RANDOM intercept / SUBJECT=DCRAN GROUP=Migration TYPE=VC SOLUTION CL;
  COVTEST / WALD;
run;

Migration is the Y variable; DCRAN is the municipality code.

I'm not sure that Migration must be put in the CLASS statement, but otherwise the model fails with this error:

"Model is too large to be fit by PROC GLIMMIX in a reasonable amount of time on this system. Consider changing your model"

Thank you for your help.

12-08-2016 06:49 PM

My advice would be to use PROC SQL to generate a unique list of municipalities, then use PROC SURVEYSELECT with METHOD=SRS to select a much smaller random sample of them, then use PROC SQL again to inner join the resulting municipality sample with your original data. Run your model on that sample. Keep taking smaller or larger samples until you find the tipping point for the error. The model fitted at that point might be your stopping point, or it might let you usefully investigate other approaches that give equivalent results without being so memory hungry.
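The three steps above could be sketched roughly as follows. The dataset Mob_06.Datas and the variable DCRAN come from the question; the work dataset names, the sample size (n=200) and the seed are placeholders I made up for illustration, so adjust them to your data.

```sas
/* Sketch only: muni_list, muni_sample, datas_sample, n=200 and seed=12345
   are hypothetical choices, not values from the original post. */

/* 1. One row per municipality */
proc sql;
  create table work.muni_list as
  select distinct DCRAN
  from Mob_06.Datas;
quit;

/* 2. Simple random sample of municipalities (without replacement) */
proc surveyselect data=work.muni_list out=work.muni_sample
                  method=srs n=200 seed=12345;
run;

/* 3. Keep every individual who lives in a sampled municipality */
proc sql;
  create table work.datas_sample as
  select d.*
  from Mob_06.Datas as d
       inner join work.muni_sample as s
       on d.DCRAN = s.DCRAN;
quit;
```

You would then point PROC GLIMMIX at work.datas_sample and rerun with a larger or smaller n= until the error appears or disappears. Sampling whole municipalities rather than individuals keeps each cluster intact, which matters for estimating the random intercept variance.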

12-09-2016 10:52 AM

First, I believe your multinomial response **is** ordinal. Consider that it will be generated by the following:

if distance_migrated = 0 then migration=0;

if 0<distance_migrated<=40 then migration=1;

if distance_migrated>40 then migration=2;

Consequently, you could change the link from glogit to cumlogit, which would go a long way toward reducing the model size and memory requirements.
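As a rough sketch of what that change might look like for the empty model in the question (dataset and variable names are taken from the original post; I have not run this against the poster's data):

```sas
/* Sketch only: cumulative logit version of the empty model.
   With a cumulative link the response does not need to appear in the
   CLASS statement, and GROUP=Migration is dropped so that a single
   random intercept per municipality is shared by both cumulative logits. */
proc glimmix data=Mob_06.Datas method=laplace noclprint;
  class DCRAN;
  model Migration = / link=cumlogit dist=multinomial solution cl;
  random intercept / subject=DCRAN type=vc solution cl;
  covtest / wald;
run;
```

This fits two threshold intercepts and one variance component instead of a separate random intercept for each non-reference category, which is where the reduction in model size comes from. The trade-off is that once covariates are added, the cumulative logit model assumes each covariate has the same effect at both cut points (proportional odds), so check whether that assumption is reasonable for your data. Also check the Response Profile table in the output to confirm the order in which the Migration levels are being modeled.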

But why categorize the response variable? You will always lose some power by categorizing the response variable (see Frank Harrell's website for more on this http://biostat.mc.vanderbilt.edu/wiki/Main/CatContinuous).

If you fit a continuous model (such as a spline) with an appropriate distribution, I believe your results will be more interpretable, more powerful and much more precise.

Steve Denham