RubenPnx
Calcite | Level 5

Hi all,

I am trying to model some data in SAS using PROC GLIMMIX, and I don't know how to specify and code the model correctly.

My data come from published studies with longitudinal measurements of the percentage of the population that is breeding, covering different species and several populations of each. The repeated measures were therefore not taken at the same times, nor do the populations have the same number of observations.

I would like to study the trend in breeding behaviour of all the species across the seasons of the year.

So I have thought of incorporating:

1) Season as a fixed factor

2) Population as a categorical random factor, to account for the repeated measurements of the same population

3) Class|Genus|Species as a nested categorical random factor, to account for evolutionary history

4) ID_study as a categorical random factor, to account for the differences between studies and methodologies.

So I would like to model it like this:

proc glimmix data=breed method=laplace;

class Season ID_study Species Genus Family Population ;

model breed =  Season

/dist=beta  solution ;

random ID_study;

random intercept / subject= Species (Genus*Family);

random Population;

nloptions maxiter=150;

lsmeans Season/cl ilink;

run;

However, I would like to take into account the heteroscedasticity between seasons.


So I would like to know:

1) Is including Population as a random effect a good way to account for the repeated measurements? (Most populations were sampled at regular intervals, but not all of them.)

2) Is the nested categorical random factor "Species|Genus|Family" coded correctly?

3) Can I specify a variance-covariance structure that accounts for heteroscedasticity between seasons, like varIdent in the nlme package of R?


Thank you for your help and your time,

Rubén

6 REPLIES
SteveDenham
Jade | Level 19

I hope by heterogeneous variances you mean fitting an overdispersion effect, as the mean and variance of a beta distribution are functionally linked.  If so, then you will have to model it as an R-side effect, which means a change in the method from method=laplace to method=rspl.

proc glimmix data=breed method=rspl;

class Season ID_study Species Genus Family Population ;

model breed =  Season

/dist=beta  solution ;

random ID_study;

random intercept / subject= Species (Genus*Family);

random Population;

random _residual_/group=season;

nloptions maxiter=150;

lsmeans Season/cl ilink;

run;

This should "work", but be cautious--the means are marginals, rather than conditional on the random effects.

And I'm still not absolutely certain that heterogeneous variances are meaningful under a beta distribution.
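
For reference, the mean-variance link mentioned above can be written out. This is only a sketch, using the parameterization documented for dist=beta in PROC GLIMMIX, with mean \mu and scale parameter \phi:

```latex
\[
  \operatorname{E}[Y] = \mu, \qquad
  \operatorname{Var}[Y] = \frac{\mu\,(1-\mu)}{1+\phi}, \qquad 0 < \mu < 1,\ \phi > 0 .
\]
```

So the variance already changes with the mean (it is largest near \mu = 0.5 and shrinks toward 0 and 1), even before any group= option is added.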

Steve Denham

RubenPnx
Calcite | Level 5

First of all, thanks for your invaluable help.

I mean the differences between the seasons in the plot of residuals versus the linear predictor.

I get that plot when I run your code, and mine too.

ods graphics on;

proc glimmix data=breeding method=rspl plots=(ResidualPanel(marginal)

                       ResidualPanel(unpack conditional));

class Season Species Genus Family Poblacion ID_study ;

model B_mix_Total_Breeding =  Season

/dist=beta ;

random intercept / subject= Species (Genus*Family); /* To take into account the taxonomy (independence of the data) */

random Poblacion (ID_study); /* to take into account repeated measures (Poblacion) and different samplings from the multiples studies (ID_study) */

random _residual_/group=season; /* to take into account the overdispersion in each level of season */

nloptions maxiter=150;

lsmeans Season/cl ilink;

run;

I would like to know whether I can consider my model acceptable or whether I should fix this.


residuals.JPG
SteveDenham
Jade | Level 19

If these plots came from the code with the random _residual_ statement, it appears that there really isn't a lot of heterogeneity in the variances.  And there looks to be an excellent fit in the QQ plot.  Under rspl, you could add a likelihood ratio test for variance homogeneity to the code:

covtest 'common variance' homogeneity;

for further evidence as to whether you need to be estimating additional parameters.  I suspect not.
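
As a rough sketch of what that test compares (the symbols here are my own notation, not SAS output labels): let S be the number of season levels, \ell_{group} the maximized restricted log pseudo-likelihood with one residual parameter per season, and \ell_{common} the same quantity with a single shared parameter. The usual likelihood ratio form is then:

```latex
\[
  \chi^2_{\mathrm{LR}} = -2\left(\ell_{\mathrm{common}} - \ell_{\mathrm{group}}\right),
  \qquad \text{referred to a } \chi^2 \text{ distribution with } S - 1 \text{ degrees of freedom}.
\]
```

A small p-value suggests the season-specific parameters are worth keeping; a large one suggests a common parameter is enough.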

Steve Denham

RubenPnx
Calcite | Level 5

I get a p-value of 0.05 from the covtest, so I can conclude that there is heteroscedasticity between seasons.

However, I have been thinking that this may be an artefact of the way I am modelling the data. Since I have monthly repeated measures of each population, and the series did not all start at the same time, maybe I should include the variables year and month to control for the possibly different effects of a month across years. That way I might control the variance between seasons better. I say this because the effect of the same month or season changes from year to year, and I should take that into account.

Will the following model correctly account for the repeated measures of each population, as well as the potential differences in variance between seasons? Or should I include year too?

With the following model, the covtest gives a p-value of 0.1449.

proc glimmix data=breed method=rspl plots=(ResidualPanel(marginal));

class Season Species Genus Family Population ID_study Month;

model B_mix_Total_Breeding =  Season Month (Season)    /* to control by the effect of months from different years */

/dist=beta ;

random intercept / subject = Population (ID_study);    /* to take into account the repeated measures of a population (Population)  and the different samplings (ID_study) */

random intercept / subject= Species (Genus*Family);    /* to take into account the taxonomy */

random _residual_/group=season;    /* to take into account the overdispersion between seasons */

nloptions maxiter=150;

lsmeans Season/cl diff PLOT=diff (NOabs center) adjust=tukey;

lsmeans Season/cl ilink;

covtest 'common variance' homogeneity;

run;

Rubén

SteveDenham
Jade | Level 19

I guess I would have set my test level for homogeneity of variance at a very small p-value, say p<0.001, as the LRT is a lot like Bartlett's test and is really sensitive to outliers and non-normality of residuals, but...

Accommodating heterogeneity is not a bad thing up front.  You might think about applying some sort of shrinkage, by including ddfm=kr2 (Kenward-Roger adjustment) in the model statement.  If you have sufficient data, this should all be OK.
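
Something along these lines is what I have in mind. It is only a sketch: it reuses the data set and variable names from the previous post unchanged, and the only addition is ddfm=kr2 on the model statement.

```sas
/* Sketch only: the model from the previous post with ddfm=kr2 added. */
/* Data set and variable names are copied from that post.             */
proc glimmix data=breed method=rspl;
  class Season Species Genus Family Population ID_study Month;

  /* ddfm=kr2 requests the Kenward-Roger (2009) adjustment */
  model B_mix_Total_Breeding = Season Month(Season)
        / dist=beta ddfm=kr2;

  random intercept / subject=Population(ID_study);  /* repeated measures within study */
  random intercept / subject=Species(Genus*Family); /* taxonomy */
  random _residual_ / group=Season;                 /* season-specific residual parameter */

  nloptions maxiter=150;
  lsmeans Season / cl ilink;
  covtest 'common variance' homogeneity;
run;
```

Comparing the denominator degrees of freedom and standard errors in the fixed-effects tests with and without ddfm=kr2 shows how much the adjustment matters for these data.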

Steve Denham

RubenPnx
Calcite | Level 5

Thank you very much; you helped me a lot with the models.
