SBS
Obsidian | Level 7

I am very new to SAS and need help fitting a Poisson mixed model first and, if there is overdispersion, a negative binomial mixed model, in the following context:

 

I want to isolate gender differences in preferences for reports, written by others, whose language scores high on several predetermined categories (e.g., personal pronouns). For each report I have the raw count of words that fall into each language category, as well as the total number of words in the report.

 

I understand that I need mixed Poisson/negative binomial regression models because, in my data, (1) the same person may buy multiple reports, and (2) the same report may be bought by multiple people.

 

I also need to control for the random effects of two categorical variables: country and industry. 

 

How do I fit a mixed model for this situation with a count dependent variable (raw count of personal pronouns in the report), an offset (the log of the total number of words in the report), the gender of the rater (m=0; f=1), and four nestings (raterID, reportID, countryCode, industryCode)? How can I tell from the results whether the Poisson or the negative binomial mixed model fits better?

 

I understand that I need to use NLMIXED, but I don't understand how to use it for my situation. I would really appreciate your help.

PGStats
Opal | Level 21

Unless you are looking at non-linear models, you should be looking at PROC GLIMMIX. It supports Poisson and negative binomial distributed responses with an offset.
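A minimal sketch of what that could look like for the Poisson case, using the data set and variable names that appear in this thread (myData, RAW_COUNT_PRONOUNS, GenderRater, RaterID). TOTAL_WORDS and myData2 are assumed names for the report-length column and the derived data set. The offset has to be on the log scale because of the log link:

```sas
/* Sketch only: TOTAL_WORDS and myData2 are assumed names, not from the thread. */
data myData2;
  set myData;
  log_TOTAL_WORDS = log(TOTAL_WORDS);  /* offset must be on the log scale */
run;

/* Poisson mixed model with a random intercept per rater. */
proc glimmix data=myData2 method=laplace;
  class RaterID;
  model RAW_COUNT_PRONOUNS = GenderRater
        / dist=poisson link=log offset=log_TOTAL_WORDS solution;
  random intercept / subject=RaterID;
run;
```

Changing dist=poisson to dist=negbinomial should give the negative binomial counterpart. With METHOD=LAPLACE or METHOD=QUAD both fits report a log likelihood, so their -2 log likelihood and AIC values can be compared, and a Pearson chi-square / DF value well above 1 in the conditional fit statistics of the Poisson fit is a common sign of overdispersion.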

PG
SBS
Obsidian | Level 7

I tried the following code with GLIMMIX:

 

proc glimmix data=myData method=quad;
class GenderRater RaterID;
model RAW_COUNT_PRONOUNS =GenderRater / link=log s dist=negbinomial offset=log_TOTAL_WORDS;
random int / subject=RaterID;
run;

 

Within a second it gives the following error: "The SAS System stopped processing this step because of insufficient memory."

 

I have only 50,000 observations and 1,500 raters. What am I getting wrong?

 

SBS
Obsidian | Level 7

I also tried NLMIXED, and it gives no results despite reporting no error:

 

proc nlmixed data=myData;
xb = b0 + b1*GenderRater + u;
mu = exp(xb +log_TOTAL_WORDS);
m = 1/alpha;
ll = lgamma(RAW_COUNT_PRONOUNS+m)-lgamma(RAW_COUNT_PRONOUNS+1)-lgamma(m)
+RAW_COUNT_PRONOUNS*log(alpha*mu)-(RAW_COUNT_PRONOUNS+m)*log(1+alpha*mu);
model RAW_COUNT_PRONOUNS ~ general(ll);
random u ~ normal(0,s2u) subject=RaterID;

run;
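For reference, the `ll` expression above is the negative binomial log likelihood written out by hand. In the notation below (the symbols y, T, x and u are labels introduced here, not SAS names), y is RAW_COUNT_PRONOUNS, T is the total word count, x is GenderRater and u is the rater-level random intercept:

```latex
\begin{aligned}
\mu  &= \exp\!\left(b_0 + b_1 x + u + \log T\right), \qquad m = 1/\alpha, \\
\ell &= \ln\Gamma(y+m) - \ln\Gamma(y+1) - \ln\Gamma(m)
        + y\,\ln(\alpha\mu) - (y+m)\,\ln(1+\alpha\mu).
\end{aligned}
```

Under this parameterization the conditional variance is \mu + \alpha\mu^2, so values of alpha close to 0 bring the model back toward the Poisson case.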

SBS
Obsidian | Level 7

I was able to get NLMIXED running. The problem was the gender variable; it worked after I dummy coded it. However, GLIMMIX still gives the same memory error.
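For anyone hitting the same problem: NLMIXED has no CLASS statement, so a character variable cannot be used directly in its programming statements. A sketch of the dummy coding, assuming (hypothetically) that the original column is called Gender and holds 'm'/'f', and writing to an assumed new data set myData2:

```sas
/* Sketch only: Gender and myData2 are assumed names, not from the thread. */
data myData2;
  set myData;
  if missing(Gender) then GenderRater = .;      /* keep missing values missing */
  else GenderRater = (upcase(Gender) = 'F');    /* 1 = female, 0 = male */
run;
```

The comparison in parentheses evaluates to 1 or 0, which matches the m=0; f=1 coding described in the question.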

SBS
Obsidian | Level 7

Unfortunately, even the method=quad(fastquad qpoints=3) option gives the same memory error.

 
