PROC GENMOD accounting for 2 repeated measures using GEE

Posted 06-14-2022 05:32 PM

Hi, I am trying to use PROC GENMOD to fit a multinomial (cumulative logit) logistic model that accounts for repeated measures using GEE. I have a categorical exposure with 3 categories and an ordinal outcome with 3 levels (low, medium, high). However, the exposures and outcomes come from pairwise comparisons between all individuals in my dataset, so my unit of analysis is a pair, not an individual. The data table is structured as follows:

| pair | sample1 | sample2 | exposure | outcome |
|------|---------|---------|------------|---------|
| 1 | A | B | category 1 | low |
| 2 | A | C | category 1 | medium |
| 3 | A | D | category 2 | high |
| 4 | B | C | category 3 | low |
| 5 | B | D | category 3 | low |
| 6 | C | D | category 2 | medium |

In this simplified example, there are 4 samples (A, B, C, and D), so there are 6 pairs when each sample is compared to all the others. I want to account for the fact that each sample occurs in multiple pairs, but I cannot figure out how to handle having 2 samples per pair. I have used the following code to treat just sample1 as the repeated-measures subject. Is there a way to use both sample1 and sample2 as the subject? I receive various errors whenever I try to do so.

```
proc genmod data=my_data;
    class exposure sample1;
    model outcome = exposure / link=cumlogit;
    repeated subject=sample1;
run;
```

Is there a way to accomplish this task?

5 Replies

The purpose of the SUBJECT= specification is to distinguish observations that are considered correlated from those considered uncorrelated: correlated observations should share the same SUBJECT= value. It sounds like you consider all 6 of those observations to be correlated. If so, you would need to create a variable that has the same value for those 6 observations and specify it in SUBJECT=. Hopefully you have many sets of 4 samples, yielding data on multiple sets of 6 observations, with each set having a unique value of the new variable. Keep in mind that the validity of the GEE method requires a large number of subjects/clusters.
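As a rough illustration of this suggestion, here is a minimal sketch. It assumes a hypothetical variable `set_id` (not present in the original data) that takes the same value for every pair built from the same set of samples. Writing `dist=multinomial` and `type=ind` explicitly is also an assumption about the intended model, not something stated in the thread.

```sas
/* Sketch only: set_id is a hypothetical cluster identifier.          */
/* All pairs from one set of samples share a set_id value, and        */
/* different sets get different values.                               */
proc genmod data=my_data;
    /* The SUBJECT= variable must also appear in the CLASS statement */
    class exposure set_id;
    model outcome = exposure / dist=multinomial link=cumlogit;
    /* Independence working correlation, stated explicitly */
    repeated subject=set_id / type=ind;
run;
```

With only one set of samples in the data, this would produce a single cluster, which is the situation the caution about needing many clusters is aimed at.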

Hi, my original question wasn't very clear. I have a total of 100 samples, and each one is compared to all of the others; that is the complete dataset. The first 99 rows are sample #1 in the sample1 column compared to the 99 other samples in the sample2 column. The next 98 rows are sample #2 in the sample1 column compared to the 98 other samples in the sample2 column that it hadn't been compared to yet, and so on. So I need to account for the clusters of sample #1, sample #2, etc. in each sample pair, but the data on those clusters occurs in 2 columns. I know I can account for the first sample of each pair by setting the sample1 column as the SUBJECT=, but the issue I am having is accounting for the clusters of samples in the sample2 column as well. Let me know if that doesn't make sense.

Your design implies that you have 4950 pairs that are correlated (100 choose 2). How many total observations do you have? If it isn't at least 49,500, GEE might not be the best tool. In fact, you may need to analyze your data with some other procedure designed for multinomial responses (FREQ, LOGISTIC, GENMOD, CATMOD) that would use the levels of pair and the levels of exposure (and their interaction, if possible) as factors.

SteveDenham
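For reference, the pair count quoted above follows directly from comparing each of the 100 samples with every other sample exactly once (99 new pairs for the first sample, 98 for the second, and so on):

$$
\binom{100}{2} = \frac{100 \times 99}{2} = 99 + 98 + \dots + 1 = 4950
$$

Each of the 100 samples therefore appears in 99 of the 4950 pairs, counting both the sample1 and sample2 columns.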

Yes. My total dataset has 4,950 observations, one per pair of samples. I have been trying to use PROC GENMOD with a cumulative logit link and a REPEATED statement. When I run it like this, I obtain 4950 clusters in the model, but I am not sure whether this is correct or how to interpret the output.

```
proc genmod data=recode descending;
    class sample1 sample2 outcome(ref = '1');
    model outcome = exposure / link=clogit;
    repeated subject=sample1(sample2);
run;
```

Note: I coded my outcome as 1 (low), 2 (medium), 3 (high)

It appears to me that there is no repeated effect in your design - there is one outcome for each of the 4950 pairs. That leads to another question - how is exposure measured? The only estimation available is the marginal effect of exposure on outcome, averaged over all the pairs (I think). What do you get if you remove the REPEATED statement?

SteveDenham
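To make the last question concrete, here is a sketch of the same model with the REPEATED statement dropped. It reuses the `recode` dataset and variable names from the code earlier in the thread. Listing `exposure` in CLASS and writing `dist=multinomial` explicitly are assumptions, not something the original poster confirmed.

```sas
/* Sketch only: same cumulative logit model, no REPEATED statement, */
/* so each of the 4950 pairs is treated as an independent row.      */
proc genmod data=recode descending;
    /* Assumption: exposure is a 3-category variable, so it is a CLASS effect */
    class exposure;
    model outcome = exposure / dist=multinomial link=clogit type3;
run;
```

Comparing the standard errors from this fit with those from the GEE fit would give a rough sense of how much the REPEATED statement is changing the inference.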
