Demographer
Pyrite | Level 9

Hi,

I want to estimate the effect of the number of years of schooling (yrschool, a continuous, non-negative variable) on salary (incwage, also continuous and non-negative), controlling for age. The dataset includes several surveys from different countries and different years (surveys are identified by the variable "sample"). I therefore think the best model would be a Poisson regression with a random intercept.

According to the SAS documentation, I should use the BGLIMM procedure:

http://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_bglimm_examples03.htm

 

My code is:

 

proc bglimm data=test.reduced3 seed=10571042 nmc=10000;
   class sample;
   model incwage = yrschool age   / dist=poisson;
   random int / sub=sample;
run;

However, I get the following error message: "ERROR: PROC BGLIMM failed to generate samples from the posterior distribution."

How can I fix this?

With the same dataset, I can fit an ordinary Poisson model without a random intercept using PROC GENMOD.

SteveDenham
Jade | Level 19

You may need an offset (see Example 3 in the BGLIMM documentation). I would also recommend including an outpost= option in the PROC BGLIMM statement. The error message could be caused by having too many levels of sample as compared to the number of observations. Have you tried fitting the fixed effects model in BGLIMM rather than GENMOD? Or tried a frequentist approach using PROC GLIMMIX? The latter may have better diagnostics for what is going on.

 

SteveDenham

Demographer
Pyrite | Level 9
Thanks for your answer. I tried adding the outpost= option, but I get the same error message. As for the offset, I'm not sure it is suitable for my model. My understanding is that an offset is used to turn a count into a rate. However, my dependent variable is the wage (the dataset is at the individual level), so I don't see what variable could be used as the offset. The variable sample has 8 categories (so 8 different surveys), while the total number of cases is above 3M. The model with fixed effects only works fine in BGLIMM. I'm not sure what the frequentist approach is; I'll investigate this option.
SteveDenham
Jade | Level 19

Wage doesn't seem to me like a variable that would follow a Poisson distribution as it usually is not a discrete count. Since it is considered continuous, you might want to try an exponential distribution or a gamma distribution here and in GENMOD.
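
As a minimal sketch of the gamma suggestion in GENMOD, assuming the dataset and variable names from the original post: the log link and the WHERE filter are assumptions, not something stated above. The gamma distribution is defined only for strictly positive values, so zero wages would need to be handled in some way; dropping them here is just one illustrative choice.

```sas
/* Sketch only: gamma regression without random effects.            */
/* link=log and the WHERE filter are assumptions for illustration.  */
proc genmod data=test.reduced3;
   where incwage > 0;   /* gamma is defined for positive values only */
   model incwage = yrschool age / dist=gamma link=log;
run;
```

With a log link, the coefficient on yrschool is read on the log scale, so exp(coefficient) is the multiplicative change in expected wage per additional year of schooling.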

 

SteveDenham

Demographer
Pyrite | Level 9

Thanks for the advice. You are right: after checking, the variable follows a gamma distribution. However, it seems PROC GENMOD does not allow random effects. Are there other procedures for fitting a gamma regression model with random effects (or any model that can simultaneously handle data from multiple surveys in multiple countries)?

StatDave
SAS Super FREQ

You can model a gamma-distributed response with random effects in PROC GLIMMIX or PROC NLMIXED. The log link function is generally used. If your model is linear on the log scale, use GLIMMIX; if it is nonlinear, use NLMIXED.
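
A minimal sketch of the GLIMMIX route, assuming the dataset and variable names from the original post, a log link, and a random intercept for each level of sample. The WHERE filter is an assumption added because the gamma distribution requires strictly positive responses; how to treat zero wages is a modeling decision this sketch does not settle.

```sas
/* Sketch only: gamma regression with a random intercept per survey. */
/* The WHERE filter and the SOLUTION option are illustrative choices. */
proc glimmix data=test.reduced3;
   where incwage > 0;                 /* gamma needs positive responses */
   class sample;
   model incwage = yrschool age / dist=gamma link=log solution;
   random intercept / subject=sample; /* one intercept per survey       */
run;
```

With only 8 levels of sample, the variance of the random intercept is estimated from very few clusters, so that estimate may be imprecise.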
