Hi, I am trying to run an ANOVA using Proc Glimmix with an unbalanced dataset. About 30% of my data are missing and I think it is causing severe underdispersion. My generalized chi square/DF value is 0.04.
This is my code:
proc glimmix;
   class Coll Gen Rep;
   model SG = Coll | Gen;
   random Rep;
   lsmeans Gen / pdiff adjust=tukey;
   /* save the LS-means and pairwise differences for the pdmix800 macro */
   ods output lsmeans=mmm diffs=ppp;
run;
%include 'C:\Users\uqkhodg1\Desktop\School-related\sas-macros\pdmix800.sas';
%pdmix800(ppp,mmm,alpha=0.05, sort=no);
Any suggestions for dealing with the missing data?
The first step with missing data is to determine (as best you can) whether the data are MCAR, MAR or MNAR.
MCAR stands for missing completely at random. This means that there is no particular reason why some data are missing. Maybe the hard disk crashed, or some responses (at random) were lost or something like that.
MAR stands for missing at random. This means that there may be reasons for the missingness, but that you can model those reasons using data that you actually have.
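MNAR means missing not at random (also known as nonignorable nonresponse). That's when neither of the above is true: the chance that a value is missing depends on the unobserved value itself, even after accounting for the data you have.
Unfortunately, there's no test that can tell MAR from MNAR using the observed data alone (Little's test addresses only MCAR) - you have to figure it out, based on logic and what you know about how the data were collected.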
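One informal check that can help with that reasoning is to see whether missingness in the response lines up with the factors you did observe. The sketch below assumes the data are in a data set called mydata (a hypothetical name, since the posted code has no DATA= option) and that the missing observations are rows where SG is missing, not rows that are absent altogether. SG_missing and miss_check are names made up for this example.

```sas
/* Flag rows where the response is missing.                        */
/* mydata, miss_check and SG_missing are hypothetical names.       */
data miss_check;
   set mydata;
   SG_missing = missing(SG);   /* 1 if SG is missing, 0 otherwise */
run;

/* Cross-tabulate the flag against the observed factors.           */
/* Very uneven missing rates across levels suggest that the data   */
/* are not MCAR. This cannot separate MAR from MNAR.               */
proc freq data=miss_check;
   tables Coll*SG_missing Gen*SG_missing / chisq nocol nopercent;
run;
```

If the missing rate is about the same in every level of Coll and Gen, that is consistent with MCAR but does not prove it. If it differs a lot, the missingness depends on at least those factors.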
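For MCAR, you don't have to do anything. The only issue will be a loss of power; the estimates will still be unbiased.
For MAR you can use PROC MI and PROC MIANALYZE to do multiple imputation of the missing data. For MNAR, standard multiple imputation is not enough on its own, because it assumes MAR; you would need to model the missingness mechanism or run a sensitivity analysis (PROC MI has an MNAR statement for that). PROC MI is pretty complicated and the choices aren't always obvious. You may want to consult with an expert.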
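To give an idea of what the multiple imputation workflow looks like, here is a rough sketch under a MAR assumption. It is not a tested recipe for this data set. The data set name mydata, the output names mi_out and parms, the seed, and the choice of 20 imputations are all assumptions for illustration. It also assumes that only SG has missing values and that PROC MIANALYZE can read the GLIMMIX ParameterEstimates table with CLASSVAR=FULL, the same way it reads the fixed-effects solution table from PROC MIXED.

```sas
/* Step 1: create 20 imputed copies of the data (hypothetical names). */
/* The default FCS regression model for SG uses main effects only,    */
/* so it does not include the Coll*Gen interaction.                   */
proc mi data=mydata nimpute=20 seed=12345 out=mi_out;
   class Coll Gen Rep;
   fcs reg(SG);
   var Coll Gen Rep SG;
run;

/* Step 2: fit the original model separately to each imputed copy. */
proc glimmix data=mi_out;
   by _Imputation_;
   class Coll Gen Rep;
   model SG = Coll | Gen / solution;
   random Rep;
   ods output ParameterEstimates=parms;
run;

/* Step 3: combine the fixed-effect estimates across imputations. */
proc mianalyze parms(classvar=full)=parms;
   class Coll Gen;
   modeleffects Intercept Coll Gen Coll*Gen;
run;
```

This only pools the fixed-effect parameter estimates. Pooling the LS-means and the Tukey-adjusted differences that feed pdmix800 takes extra steps that are not shown here.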