Demographer
Pyrite | Level 9

Hi,

I want to build a model predicting occupation in 4 categories (occ_reduced: high-skilled, medium-skilled, low-skilled, unemployed) from individual characteristics (edu, sex, agegr) and from the demand for high-skilled jobs (norm_demH) and for low- and medium-skilled jobs (norm_demML), which are year- and country-specific variables. The dataset combines surveys from different countries and different years. Each survey is identified by the variable "strate".

I first built a multinomial logit model with proc logistic, which gives consistent parameters:

proc logistic data=lab.sample outmodel=lab.occup_model2;
class edu(ref='2') imm_var(ref='0') sex / param = ref;
model occ_reduced(ref='UNEM')=edu sex agegr norm_demH norm_demML / link = glogit;
weight weight_samp /norm;
where labour=1;
run;

However, I feel I should use a multilevel model, since norm_demH and norm_demML are survey-specific variables. I tried several options in proc glimmix, but the model never converges. The last attempt looks like this:

 

proc glimmix data=lab.sample INITGLM;
class occ_reduced agegr SEX strate ;
model occ_reduced(ref='UNEM') =edu sex agegr norm_demH norm_demML/ solution dist=MULTINOMIAL link=glogit;
random intercept / subject=strate GROUP=occ_reduced;
weight weight_samp;
where labour=1;
NLOPTIONS TECH=NRRIDG MAXITER=100 ;
run;

I'm not so used to this kind of multilevel model so maybe there is something I'm missing. Any tips?
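
For reference, the random-intercept model that the PROC GLIMMIX code above appears to specify can be written as follows. This is only a sketch under assumptions: the symbols ($x_{ij}$, $\beta_c$, $u_{jc}$, $\sigma_c^2$) are notation introduced here, not taken from the post, and GROUP=occ_reduced is read as giving each non-reference category its own, independent random intercept per survey.

```latex
% i indexes individuals, j indexes surveys (strate), c is a non-reference category
\log \frac{\Pr(Y_{ij} = c \mid u_{jc})}{\Pr(Y_{ij} = \text{UNEM} \mid u_{j})}
  = x_{ij}^{\top} \beta_c + u_{jc},
\qquad c \in \{\text{HIGH}, \text{MED}, \text{LOW}\},
\qquad u_{jc} \sim N(0, \sigma_c^2)
```

Under this reading there are three covariance parameters to estimate, one variance $\sigma_c^2$ per non-reference category.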

17 Replies
jiltao
SAS Super FREQ

What happens if you add the METHOD=QUAD or METHOD=LAPLACE option to the PROC GLIMMIX statement?

If that does not help, what non-convergence messages did you get before and after adding one of these options?

Thanks,

Jill

Demographer
Pyrite | Level 9

I get this error message.

118  proc glimmix data=lab.sample METHOD=QUAD;
119  title "Random intercept";
120  class occ_reduced agegr SEX strate  ;
121  model occ_reduced(ref='UNEM') =edu sex agegr rdemH rdemM rdemL/ solution dist=MULTINOMIAL
121! link=glogit;
122  random intercept / subject=strate GROUP=occ_reduced;
123  weight weight_samp;
124  where labour=1;
125  run;



NOTE: Some observations are not used in the analysis because of: zero or negative weight (n=112144),
      missing weight (n=10511).
NOTE: PROC GLIMMIX is fitting a model for nominal (unordered) data. This type of model contrasts each
      response level against a reference level (occ_reduced='UNEM').
ERROR: Infeasible parameter values for evaluation of objective function with 1 quadrature point.
NOTE: PROCEDURE GLIMMIX used (Total process time):
      real time           2:52.39
      cpu time            2:50.72


ballardw
Super User

TIP: Include the entire LOG of the attempt. SAS often provides information there that lets someone familiar with the procedure point toward a resolution. Copy the entire text from the LOG: the code and all the notes, messages, warnings and details. Then, on the forum, open a text box using the </> icon that appears above the message window and paste the text.

The text box will preserve the formatting of the text from the log and visually set the details apart from the discussion or question-and-answer text. Note: sometimes the code in the LOG turns out to differ from the code shared in the problem description, which is why we ask for the entire code from the log.

Demographer
Pyrite | Level 9
110  proc glimmix data=lab.sample;
111  title "Random intercept";
112  class occ_reduced agegr SEX strate  ;
113  model occ_reduced(ref='UNEM') =edu sex agegr rdemH rdemM rdemL/ solution dist=MULTINOMIAL
113! link=glogit;
114  random intercept / subject=strate GROUP=occ_reduced;
115  weight weight_samp;
116  where labour=1;
117  run;



NOTE: Some observations are not used in the analysis because of: zero or negative weight (n=112144),
      missing weight (n=10511).
NOTE: PROC GLIMMIX is fitting a model for nominal (unordered) data. This type of model contrasts each
      response level against a reference level (occ_reduced='UNEM').
NOTE: Did not converge.
NOTE: PROCEDURE GLIMMIX used (Total process time):
      real time           25:38.21
      cpu time            25:37.26

Season
Lapis Lazuli | Level 10

What is the sample size of your entire data? I saw plenty of observations excluded from analysis due to zero, negative or missing weights. You could alternatively provide the size of the sample eventually used by the GLIMMIX procedure.

Demographer
Pyrite | Level 9

The sample size of valid observations is 1,156,858. 

 

 

Number of Observations Read    1279513
Number of Observations Used    1156858

Response Profile
Ordered Value    occ_reduced    Total Frequency
1                HIGH           418183
2                LOW             95326
3                MED            553169
4                UNEM            90180
In modeling category probabilities, occ_reduced='UNEM' serves as the reference category.

Dimensions
G-side Cov. Parameters         3
Columns in X                   57
Columns in Z per Subject       3
Subjects (Blocks in V)         189
Max Obs per Subject            27313

Optimization Information
Optimization Technique         Dual Quasi-Newton
Parameters in Optimization     54
Lower Boundaries               3
Upper Boundaries               0
Fixed Effects                  Not Profiled
Starting From                  GLM estimates

The initial estimates did not yield a valid objective function.

jiltao
SAS Super FREQ

You might add a PARMS statement to provide your own starting values for the covariance parameter estimates. It might take several rounds of trial and error.

Season
Lapis Lazuli | Level 10

I am also curious about two issues with user-supplied starting values.

(1) As I recall, maximum likelihood is the estimation method used here. In theory, if the model is identifiable, the maximum likelihood estimator should be unique. So is it useful to try different starting values?

(2) I have tried providing starting values in the NLMIXED procedure and noticed that they affected the final parameter estimates, which seems to contradict the theoretical result above and puzzles me. Wouldn't it then be subjective to choose starting values arbitrarily?

Demographer
Pyrite | Level 9

I tried to add parms but still no convergence. 

 


135  proc glimmix data=lab.sample;
136  class occ_reduced agegr SEX strate  ;
137  model occ_reduced(ref='UNEM') =edu sex agegr rdemH rdemM rdemL/ solution dist=MULTINOMIAL
137! link=glogit;
138  random intercept / subject=strate GROUP=occ_reduced;
139  weight weight_samp;
140  where labour=1;
141  parms;
142  run;



NOTE: Some observations are not used in the analysis because of: zero or negative weight (n=112144),
      missing weight (n=10511).
NOTE: PROC GLIMMIX is fitting a model for nominal (unordered) data. This type of model contrasts each
      response level against a reference level (occ_reduced='UNEM').
NOTE: Did not converge.
NOTE: PROCEDURE GLIMMIX used (Total process time):
      real time           25:50.28
      cpu time            25:47.21

Season
Lapis Lazuli | Level 10
The PARMS statement provides starting values for the iterations that estimate the parameters (in PROC GLIMMIX, the covariance parameters rather than the regression coefficients). With no starting values supplied, simply including "PARMS;" in the code does not help.
Demographer
Pyrite | Level 9

How do I choose the starting values? Should I just pick random numbers?

Season
Lapis Lazuli | Level 10

As @jiltao mentioned, the starting value (or values) is found by trial and error. I understand that to mean choosing it somewhat arbitrarily.

If this step is one of several in your analysis and the parameters can be estimated in a preceding step, you can use those estimates as the starting values for the current step. This practice is adopted in SAS for Mixed Models, Second Edition (Ramon C. Littell, Milliken, et al.; ISBN 9781590475003).
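
One way to make the trial and error less arbitrary may be to let PROC GLIMMIX evaluate a small grid of candidate values and start from the best grid point. The following is a minimal sketch, assuming three covariance parameters (one random-intercept variance per non-reference category, as in the Dimensions table earlier in the thread). The candidate values are purely illustrative guesses, and with a data set this large every extra grid point is costly, so the grid is kept tiny.

```sas
proc glimmix data=lab.sample;
  class occ_reduced agegr sex strate;
  model occ_reduced(ref='UNEM') = edu sex agegr rdemH rdemM rdemL
        / solution dist=multinomial link=glogit;
  random intercept / subject=strate group=occ_reduced;
  weight weight_samp;
  where labour=1;
  /* One value list per covariance parameter. The values are illustrative */
  /* guesses, not derived from the data. Two candidates for each of the   */
  /* three parameters give 2*2*2 = 8 starting points to evaluate.         */
  parms (0.1, 1) (0.1, 1) (0.1, 1);
run;
```

If one grid point yields a valid objective function, that region is probably a more sensible place to refine the starting values than a blind guess.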

jiltao
SAS Super FREQ

sample syntax:

parms (2) (3) (0.5) (1);

But the values would depend on your data so you need to make appropriate changes.

One approach is to fit a simpler model and hopefully it will converge. Then you can use the estimated values as the starting values for your model. For example,

proc glimmix data=lab.sample method=laplace;
 class occ_reduced  ;
  model occ_reduced(ref='UNEM') =/ solution dist=MULTINOMIAL
 link=glogit;
 random intercept / subject=strate GROUP=occ_reduced;
 where labour=1;
run;

Thanks,

Jill
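
The two-step idea above could be sketched roughly as follows: save the covariance parameter estimates of the simpler model with ODS OUTPUT, then pass them to the full model through the PDATA= option of the PARMS statement. This assumes the simpler model converges; the data set name work.cp_start is a hypothetical name chosen for illustration, and using METHOD=LAPLACE in both steps is an assumption, not something the thread has confirmed works for this data.

```sas
/* Step 1: intercept-only model; keep the covariance parameter estimates. */
proc glimmix data=lab.sample method=laplace;
  class occ_reduced strate;
  model occ_reduced(ref='UNEM') = / solution dist=multinomial link=glogit;
  random intercept / subject=strate group=occ_reduced;
  where labour=1;
  ods output CovParms=work.cp_start;  /* hypothetical data set name */
run;

/* Step 2: full model, starting from the step-1 estimates.               */
/* PDATA= reads the starting values from the Estimate variable of the    */
/* saved CovParms table; the RANDOM statement is unchanged, so the order */
/* of the covariance parameters should match.                            */
proc glimmix data=lab.sample method=laplace;
  class occ_reduced agegr sex strate;
  model occ_reduced(ref='UNEM') = edu sex agegr rdemH rdemM rdemL
        / solution dist=multinomial link=glogit;
  random intercept / subject=strate group=occ_reduced;
  weight weight_samp;
  where labour=1;
  parms / pdata=work.cp_start;
run;
```

If step 1 does not converge, work.cp_start may be missing or unreliable, so the log of step 1 should be checked before running step 2.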

Demographer
Pyrite | Level 9

Thanks for the advice. I tried a model with no covariates, but it still does not converge.

 


183  proc glimmix data=lab.sample;
184  class occ_reduced agegr SEX strate  ;
185  model occ_reduced(ref='UNEM') =/*edu sex agegr rdemH rdemM rdemL*// solution dist=MULTINOMIAL
185! link=glogit;
186  random intercept / subject=strate GROUP=occ_reduced;
187  weight weight_samp;
188  where labour=1;
189  parms /*(2) (3) (0.5) (1)*/;
190  run;



NOTE: Some observations are not used in the analysis because of: zero or negative weight (n=112144),
      missing weight (n=10511).
NOTE: PROC GLIMMIX is fitting a model for nominal (unordered) data. This type of model contrasts each
      response level against a reference level (occ_reduced='UNEM').
WARNING: Pseudo-likelihood update fails in outer iteration 3.
NOTE: Did not converge.
NOTE: PROCEDURE GLIMMIX used (Total process time):
      real time           1:54.29
      cpu time            1:51.84

