Hi all,
I'm new to the forums and a beginner-to-intermediate SAS user (v9.4). I've been running PROC GENMOD with a Poisson distribution for my outcome, which is the number of word pairs remembered (a memory study). I've run into several issues and would appreciate some guidance. The design is within-subjects, and the factors I'm including are condition, order (i.e., which condition was administered first), and condition*order (all fixed), plus subject (REPEATED statement). I decided to use an iterative approach: start with a full model containing all factors, then drop order*condition, then order, based on the model fit statistics. I encountered the following problems:
(1) When I run a model with the following code:
proc genmod data=Aftern plots=none;
  where condition ~='First' and values_relatedness='unrelated';
  class Condition Subject_ID_byDate Order;
  model Number_correct=Condition Order Condition*Order / dist=poisson offset=night_number_words;
  repeated subject=Subject_ID_byDate / TYPE=AR(1);
run;
I get an error in the log (see attached screenshot 1) saying
ERROR: Error in parameter estimate covariance computation.
ERROR: Error in estimation routine.
No model output is produced either.
I read online that there may not be sufficient variation among subjects to allow for a random intercept, so I removed the repeated statement and it ran. However, I encountered another error where the effect (chi-square and p-value) for condition*order was blank (see below).
I then removed the "condition*order" effect and it ran fine (again without the repeated statement), but my p-value for effect of condition was very different (see below)!
Then I tried re-introducing the REPEATED statement (with only condition and order, no condition*order), and it ran fine here too! Except now the parameter estimates table is titled "Analysis Of GEE Parameter Estimates" and the fit criteria are QIC and QICu instead of AIC, etc. (see screenshot below). Without the REPEATED statement, the output contains "Criteria For Assessing Goodness Of Fit" (with AIC, etc.) and "Analysis Of Maximum Likelihood Parameter Estimates."
My main questions are: (1) Should I remove the REPEATED statement OR condition*order to allow the model to run? Which is more appropriate, or does it depend on the research questions/design? (2) Why is condition*order blank in the second screenshot? Can I trust this model, especially as removing condition*order drastically changes the effect of condition in the third screenshot? (3) What is the difference between GEE and maximum likelihood, and how can I compare model fit across these models when the fit statistics differ (e.g., QIC vs. AICC, as in the third versus fourth screenshot)? (I know that PROC MIXED, for example, produces AICC whether or not the random intercept is present, which allows for model comparison.)
Your help is much appreciated and greatly needed.
EDIT: Sorry, I forgot to link to the note on separation. It is added below.
Note that with the REPEATED statement, the model is not a random effects model. There are no random intercepts. Since the GEE model obtained with the REPEATED statement is not fit by a likelihood-based method, the usual AIC and BIC statistics are not available. The QIC statistic is an analogous statistic developed for the GEE model. The GEE algorithm is described in the Details section of the GENMOD documentation.
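As a rough sketch of how QIC could be used to compare mean structures, the two fits below keep the same REPEATED statement and working correlation and differ only in whether Condition*Order is included. The data set and variable names are taken from the original post. The exchangeable working correlation (TYPE=EXCH), the output data set names qic_full and qic_reduced, and the omission of the OFFSET= option are assumptions made for illustration, not a recommendation. The full model may still fail to fit if the data are sparse.

```sas
/* Sketch only: compare two GEE fits by QIC/QICu.                      */
/* Both fits use the same REPEATED statement so the criteria are       */
/* computed on the same basis. TYPE=EXCH is an assumed choice.         */

ods output GEEFitCriteria=qic_full;      /* hypothetical data set name */
proc genmod data=Aftern plots=none;
  where condition ~= 'First' and values_relatedness = 'unrelated';
  class Condition Order Subject_ID_byDate;
  model Number_correct = Condition Order Condition*Order / dist=poisson;
  repeated subject=Subject_ID_byDate / type=exch;
run;

ods output GEEFitCriteria=qic_reduced;   /* hypothetical data set name */
proc genmod data=Aftern plots=none;
  where condition ~= 'First' and values_relatedness = 'unrelated';
  class Condition Order Subject_ID_byDate;
  model Number_correct = Condition Order / dist=poisson;
  repeated subject=Subject_ID_byDate / type=exch;
run;

/* Print the two sets of criteria for a side-by-side look. */
proc print data=qic_full;    title 'QIC: with Condition*Order';    run;
proc print data=qic_reduced; title 'QIC: without Condition*Order'; run;
title;
```

Smaller QIC values indicate the preferred model. QIC and AIC are computed on different bases, so comparisons are only meaningful among GEE fits, not between a GEE fit and a maximum likelihood fit.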
If your data consist of a number remembered out of some number of trials for each subject, then the data are binomial. If there is only one set of trials per subject then you don't need the REPEATED statement and you could fit a logistic model using the number of trials variable in the events/trials syntax:
proc logistic;
class condition order / param=glm;
model morn_vs_night_number_correct/num_trials = condition order condition*order;
run;
If there are multiple sets of trials per subject, then you would still need the REPEATED statement:
proc genmod;
  /* the SUBJECT= variable must also appear in the CLASS statement */
  class condition order subject_id;
  model morn_vs_night_number_correct/num_trials = condition order condition*order / dist=bin;
  repeated subject=subject_id;
run;
If the first model can be used and PROC LOGISTIC reports a "separation" condition, then the data are probably too sparse for the model. Similarly, sparseness could be the problem if the second model is needed and you get errors like you mentioned. See this note concerning sparseness and separation.
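One simple way to look for sparseness is to tabulate how many observations fall in each Condition by Order cell under the same WHERE filter used for the model. This is only a sketch, using the data set and variable names from the original post. Empty or nearly empty cells would be one possible reason for interaction parameters that cannot be estimated.

```sas
/* Sketch only: count observations in each Condition x Order cell,    */
/* using the same subset of the data as the model in the first post.  */
proc freq data=Aftern;
  where condition ~= 'First' and values_relatedness = 'unrelated';
  tables Condition*Order / norow nocol nopercent;
run;
```

If some cells show a frequency of zero, the corresponding Condition*Order parameters have no data to support them, and dropping the interaction or collapsing levels may be worth considering.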
Thanks for the reply. The design is repeated measures (each subject experiences a condition), so a REPEATED statement would be warranted (if sparseness is not an issue!). However, I don't see the note concerning sparseness and separation.
In addition to the missing sparseness/separation note, I have a couple of follow-up questions:
(1) If AIC etc. are not available with the repeated statement in PROC GENMOD, is it possible to get QIC without the repeated statement? How else will I compare fit across models?
(2) Do you know why the effects (chi-square and p-value) for order*condition are blank in the second screenshot I showed under "analysis of maximum likelihood parameter estimates"?
Thanks.