In a paper on reintroduced cranes, I have data like this: ID, Sex, Year, Out* (e.g., 1-01, M, 2002, 1)
I want to show that females disperse more than males.
I used a chi-square test. A reviewer indicated that this violates the assumption of independence, because there are multiple years of data for the same birds, and said that I should use GLIMMIX instead.
What lines of code are needed to set up the model?
SAS version 9.x (latest)
*Dispersal: 1 = Out of reintroduction area. 0 = In reintro. area.
I tried the code below, where:
id = identification code for each bird.
out = dispersal value (= 1 if out of the reintroduction area); each bird has data in one or more years (observations).
This code:
proc glimmix;
title 'difference in dispersal by sex including all birds';
class sex id;
model out = sex / link=logit dist=binomial;
random int / subject=id;
run;
Produces these results: Type III Tests of Fixed Effects:
Effect = Sex, Num DF = 1, Den DF = 568, F Value = 5.26, Pr > F = 0.0221
Does this look correct? Thanks, Richard
Is this a real experimental dataset or some academic exercise?
I tried the following:
libname xl Excel "&sasforum\datasets\crane dispersal data.xlsx" access=readonly;
proc sql;
create table cranes as
select *
from xl.'Sheet1$'n;
quit;
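/* Keep only each bird's earliest record (its first sighting) */
proc sql;
create table firstSightings as
select *
from cranes
group by ID
having year = min(year);
quit;
/* Test the sex-by-dispersal association at first sighting */
proc freq data=firstSightings;
table sex*out / chisq;
run;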
What surprises me is that the frequency table shows a perfectly even split in first-sighting locations (79 birds each for in and out).
Otherwise, the chi-square and Fisher's exact tests indicate that initial dispersal of females is significantly greater than initial dispersal of males. That, in itself, shows a difference in behaviour.
Yes, this is real data being analyzed for publication. Using chi-square to look at difference in first-year dispersal between males and females is of course one question that could be asked, but the question here is difference in dispersal over all years of birds' lives. Is GLIMMIX set up with the indicated code appropriate for that analysis?
Thanks PG,
R
I think it is. I would write it as
proc glimmix data=cranes;
class sex id;
model out(event='1') = sex / link=logit dist=binary solution;
random int / subject=id;
lsmeans sex / ilink;
run;
and get essentially the same results.
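To report the sex effect on a scale that is easier to put in a paper, the same model can also be asked for an odds ratio and confidence limits. This is only a sketch: it assumes the CRANES data set and the variables SEX, ID and OUT used above, and it adds the CL, DIFF and ODDSRATIO options on the LSMEANS statement, which were not part of the code posted in this thread.

```sas
/* Sketch only: same random-intercept model as above, with extra LSMEANS output. */
/* Assumes data set CRANES with variables SEX, ID and OUT (1 = out of area).     */
proc glimmix data=cranes;
   class sex id;
   model out(event='1') = sex / link=logit dist=binary solution;
   random int / subject=id;
   /* ilink:     back-transform the LS-means to probabilities of dispersal       */
   /* cl:        confidence limits for the estimates                             */
   /* diff:      female vs. male difference on the logit scale                   */
   /* oddsratio: report that difference as an odds ratio                         */
   lsmeans sex / ilink cl diff oddsratio;
run;
```

Because the model includes a random intercept for each bird, the probabilities and the odds ratio are conditional (subject-specific) estimates, evaluated at a random effect of zero, rather than population-averaged values.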
And to accommodate year, add on to @PGStats' code:
proc glimmix data=cranes;
class sex year id;
model out(event='1') = sex|year / link=logit dist=binary solution;
random int / subject=id;
random year/residual subject=id type=unr;
lsmeans sex year sex*year/ ilink;
run;
I picked an unstructured correlation matrix (type=unr) assuming that the number of years was small.
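Whether the number of years really is small is easy to check before fitting. The sketch below assumes the CRANES data set with variables ID and YEAR; the names n_years, n_birds, n_obs and obs_per_bird are made up for illustration. An unstructured matrix over t years has t(t+1)/2 covariance parameters, so it grows quickly as years are added.

```sas
/* Sketch only: count distinct years and birds, then records per bird.  */
/* Assumes data set CRANES with variables ID and YEAR.                  */
proc sql;
   /* How many distinct years and birds are in the data?                */
   select count(distinct year) as n_years,
          count(distinct id)   as n_birds
   from cranes;

   /* One row per bird with its number of yearly records                */
   create table obs_per_bird as
   select id, count(*) as n_obs
   from cranes
   group by id;
quit;

/* Distribution of the number of records per bird                       */
proc freq data=obs_per_bird;
   tables n_obs;
run;
```

If most birds have only one or two records, or there are many distinct years, a simpler structure than type=unr may be easier to estimate.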
Steve Denham
Thanks Steve. I now have four sets of code for the PROC GLIMMIX problem and am working out which one to use. I don't think the YEAR variable is relevant to this problem; it is basically functioning only as an observation number.
Richard
Probably that is the case, but this would address the reviewer's concern outlined in the first post. It would also give you the correlations over time of the observations.
Steve Denham