11-27-2016 11:59 AM
In a paper on reintroduced cranes, I have data like this: ID, Sex, Year, Out* (e.g., 1-01, M, 2002, 1)
I want to show that females disperse more than males.
I used chi-square. Reviewer indicated this violated assumption of independence because of multiple years of data for same birds and that I should use GLIMMIX.
What are lines of code needed to set up model?
SAS version 9.x (latest)
*Dispersal: 1 = Out of reintroduction area. 0 = In reintro. area.
I tried this:
id = identification code for each bird.
out = dispersal value (=1 if out of reintro. area); each bird has data in 1 or more years (observations).
title 'difference in dispersal by sex including all birds';
proc glimmix;
class sex id;
model out = sex / link=logit dist=binomial;
random int / subject=id;
run;
Produces these results: Type III Tests of Fixed Effects:
Effect = Sex, Num DF = 1, Den DF = 568, F Value = 5.26, Pr > F = 0.0221
Does this look correct? Thanks, Richard
11-27-2016 04:33 PM
Is this a real experimental dataset or some academic exercise?
I tried the following:
libname xl Excel "&sasforum\datasets\crane dispersal data.xlsx" access=readonly;
proc sql;
create table cranes as select * from xl.'Sheet1$'n;
quit;
proc sql;
create table firstSighthings as
select * from cranes group by ID having year = min(year);
quit;
proc freq data=firstSighthings;
table sex*out / chisq;
run;
What surprises me is that the frequency table shows an exactly even split in first-sighting locations (79 each for in and out).
Otherwise, the chi-square and Fisher's tests indicate that initial dispersal of females is significantly greater than initial dispersal of males. That, in itself, shows a difference in behaviour.
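To gauge how much repeated measurement the reviewer's concern involves, a quick tabulation of how many years of data each bird contributes might look like the sketch below. It assumes the cranes table created above holds one row per bird per year; n_years and n_birds are names made up for illustration.

```sas
proc sql;
  /* inner query: one row per bird with its number of observation years */
  /* outer query: how many birds have 1, 2, 3, ... years of data        */
  select n_years, count(*) as n_birds
  from (select ID, count(*) as n_years
        from cranes
        group by ID)
  group by n_years
  order by n_years;
quit;
```

If most birds appear in only one year, the independence problem is mild; if many appear in several years, a model with a bird-level random effect is easier to justify.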
11-28-2016 02:10 AM
Yes, this is real data being analyzed for publication. Using chi-square to look at difference in first-year dispersal between males and females is of course one question that could be asked, but the question here is difference in dispersal over all years of birds' lives. Is GLIMMIX set up with the indicated code appropriate for that analysis?
11-28-2016 11:35 PM
I think it is. I would write it as
proc glimmix data=cranes;
class sex id;
model out(event='1') = sex / link=logit dist=binary solution;
random int / subject=id;
lsmeans sex / ilink;
run;
and get essentially the same results.
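For what it's worth, the ilink option asks lsmeans to report each sex's estimate on the probability scale as well as on the logit scale. The back-transformation is just the inverse logit. A minimal sketch, using a made-up logit value (0.4 is purely illustrative, not an estimate from the crane data):

```sas
data _null_;
  eta = 0.4;                  /* hypothetical LS-mean on the logit scale */
  p = 1 / (1 + exp(-eta));    /* inverse logit: probability that out=1   */
  put eta= p=;                /* write both values to the log            */
run;
```

With a random intercept in the model, that probability applies to a bird whose random effect is zero. It should not be read as a population average.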
12-01-2016 09:06 AM
And to accommodate year, add on to @PGStats' code:
proc glimmix data=cranes;
class sex year id;
model out(event='1') = sex|year / link=logit dist=binary solution;
random int / subject=id;
random year / residual subject=id type=unr;
lsmeans sex year sex*year / ilink;
run;
I picked an unstructured correlation matrix (type=unr) assuming that the number of years was small.
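On "small": an unstructured matrix over t years has t variances plus t(t-1)/2 correlations, so the number of covariance parameters grows quickly with the number of years. A quick sketch of that arithmetic (unr_params, n_years and n_params are illustrative names, not anything from the crane data):

```sas
data unr_params;
  do n_years = 2 to 10;
    /* t variances + t(t-1)/2 correlations = t(t+1)/2 parameters */
    n_params = n_years * (n_years + 1) / 2;
    output;
  end;
run;

proc print data=unr_params noobs;
run;
```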
12-01-2016 09:40 AM
Thanks Steve. I now have four versions of the PROC GLIMMIX code and am working out which one to use. I don't think the YEAR variable is relevant to this problem; it is basically functioning only as an observation number.
12-01-2016 10:10 AM
Probably that is the case, but this would address the reviewer's concern outlined in the first post. It would also give you the correlations over time of the observations.