Repeated measures in PROC GENMOD
07-29-2009 03:13 PM

I would appreciate some advice on the following question. I am studying paternity in a monogamous bird species and want to test whether the probability that a male is cuckolded (a binary variable) depends on his body size, his female's body size, his genetic relatedness to his female, and the nest density around his nest (all continuous variables).

Since I have data for two years (2002 and 2003), I think that the best solution is to fit a repeated-measures logistic regression using PROC GENMOD:

proc genmod data = cuckold descending;
  class maleIdentity femaleIdentity year;
  model cuckoldry = maleSize femaleSize relatedness density / dist = bin link = logit;
  repeated subject = maleIdentity * femaleIdentity / within = year type = unstr;
  output out = genmod_fit_i p = phat;
run;

However, I am a bit worried about using my entire data set. A few individuals changed partners between 2002 and 2003 (they divorced or became widowed). For some other pairs, I have data for one year only (the birds did not attempt to breed in the other year).

Under these conditions, I am not sure how to proceed, and I see three options. Should I use my whole data set? Should I use a subset including only those males for which I have data for both years? In this subset, two males changed females between 2002 and 2003, but no female changed males. If I use repeated subject = male * female, I will account for the non-independence within pairs, but each of these two males will contribute two clusters that are not independent of each other because they share the same male, and each of these four clusters will have only one year of data. Or should I use only the pairs for which I have two years of data? In that case, I will have only 13 clusters, versus 55 if I use the whole data set.

Thank you in advance for your help. Message was edited by: JBHorta

08-11-2009 11:01 AM

Assuming that the response is recorded for each male in each year, I would think you could fit a model using all of your data with your male identifier as the SUBJECT= effect in the REPEATED statement. This specifies that all observations with the same value of the male identifier are correlated and that observations with different values are uncorrelated. The GEE method implemented by the REPEATED statement allows unequal numbers of measurements among the subjects, so it is not a problem if some males are observed only once.
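
As a rough sketch of what that could look like, here is the step from the question with only the SUBJECT= effect changed to the male identifier. It assumes each male has at most one record per year, so that WITHIN=year identifies the measurement unambiguously. The WITHIN= and TYPE= options are carried over from the original post rather than recommended here, and the output data set name genmod_fit_male is just a placeholder.

```sas
/* Sketch only: same model as in the question, clustered on the male alone. */
/* Assumes at most one record per male per year.                            */
proc genmod data = cuckold descending;
  class maleIdentity year;
  model cuckoldry = maleSize femaleSize relatedness density / dist = bin link = logit;
  /* All records sharing a maleIdentity are treated as correlated;      */
  /* males observed in a single year simply form clusters of size one.  */
  repeated subject = maleIdentity / within = year type = unstr;
  /* genmod_fit_male is a hypothetical output data set name. */
  output out = genmod_fit_male p = phat;
run;
```

With only two years, the unstructured working correlation matrix has a single off-diagonal parameter, so the choice of TYPE= matters less here than it would with more time points.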