<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PROC GENMOD GEE with overdispersed data in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GENMOD-GEE-with-overdispersed-data/m-p/830456#M41125</link>
    <description>&lt;P&gt;Hi all.&lt;/P&gt;&lt;P class=""&gt;I'm interested in analyzing how different affective traits predict behaviors during time-out (in the context of a treatment program) using GEE in SAS.&lt;/P&gt;&lt;P class=""&gt;- I have 49 youth, and my outcome is an overdispersed count variable with a negative binomial distribution. I also want to account for nesting of 3 levels: time-outs are nested within time [i.e., treatment week], and time is nested within child (ID).&lt;/P&gt;&lt;P class=""&gt;- It is an unbalanced dataset, such that some kids got 30 time-outs while others only got 1 or 2.&lt;/P&gt;&lt;P class=""&gt;- I've included the number of time-outs within a day as an offset term to account for the fact that youth with 1 time-out in a day are different from youth with 4 time-outs in a day.&lt;/P&gt;&lt;P class=""&gt;- I'm also interested in various interactions but haven't included them yet because I want to ensure that my model is correct.&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;Here's my syntax so far with centered (c) predictors:&lt;/P&gt;&lt;P class=""&gt;Title "Model 1";&lt;/P&gt;&lt;P class=""&gt;PROC GENMOD DATA = data.dataset;&lt;/P&gt;&lt;P class=""&gt;CLASS ID cweek;&lt;/P&gt;&lt;P class=""&gt;MODEL Behavior = meds age cx1 cx2 cx3 cweek /&lt;/P&gt;&lt;P class=""&gt;dist = negbin link = log offset = timeout_withindaycount;&lt;/P&gt;&lt;P class=""&gt;REPEATED SUBJECT = ID(cweek) / type = exch;&lt;/P&gt;&lt;P class=""&gt;RUN;&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;My REPEATED line is probably wrong: my time variable isn't meant to be categorical, but with it on the CLASS statement the model produces individual effects for all 8 weeks. However, I'm unsure how to properly include nesting across the 3 levels if time isn't on the CLASS statement. Any feedback on where I've gone wrong would be great. Sorry for the elementary questions.&lt;/P&gt;&lt;P class=""&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Thu, 25 Aug 2022 23:32:42 GMT</pubDate>
    <dc:creator>PSB</dc:creator>
    <dc:date>2022-08-25T23:32:42Z</dc:date>
    <item>
      <title>PROC GENMOD GEE with overdispersed data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GENMOD-GEE-with-overdispersed-data/m-p/830456#M41125</link>
      <description>&lt;P&gt;Hi all.&lt;/P&gt;&lt;P class=""&gt;I'm interested in analyzing how different affective traits predict behaviors during time-out (in the context of a treatment program) using GEE in SAS.&lt;/P&gt;&lt;P class=""&gt;- I have 49 youth, and my outcome is an overdispersed count variable with a negative binomial distribution. I also want to account for nesting of 3 levels: time-outs are nested within time [i.e., treatment week], and time is nested within child (ID).&lt;/P&gt;&lt;P class=""&gt;- It is an unbalanced dataset, such that some kids got 30 time-outs while others only got 1 or 2.&lt;/P&gt;&lt;P class=""&gt;- I've included the number of time-outs within a day as an offset term to account for the fact that youth with 1 time-out in a day are different from youth with 4 time-outs in a day.&lt;/P&gt;&lt;P class=""&gt;- I'm also interested in various interactions but haven't included them yet because I want to ensure that my model is correct.&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;Here's my syntax so far with centered (c) predictors:&lt;/P&gt;&lt;P class=""&gt;Title "Model 1";&lt;/P&gt;&lt;P class=""&gt;PROC GENMOD DATA = data.dataset;&lt;/P&gt;&lt;P class=""&gt;CLASS ID cweek;&lt;/P&gt;&lt;P class=""&gt;MODEL Behavior = meds age cx1 cx2 cx3 cweek /&lt;/P&gt;&lt;P class=""&gt;dist = negbin link = log offset = timeout_withindaycount;&lt;/P&gt;&lt;P class=""&gt;REPEATED SUBJECT = ID(cweek) / type = exch;&lt;/P&gt;&lt;P class=""&gt;RUN;&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;My REPEATED line is probably wrong: my time variable isn't meant to be categorical, but with it on the CLASS statement the model produces individual effects for all 8 weeks. However, I'm unsure how to properly include nesting across the 3 levels if time isn't on the CLASS statement. Any feedback on where I've gone wrong would be great. Sorry for the elementary questions.&lt;/P&gt;&lt;P class=""&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2022 23:32:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GENMOD-GEE-with-overdispersed-data/m-p/830456#M41125</guid>
      <dc:creator>PSB</dc:creator>
      <dc:date>2022-08-25T23:32:42Z</dc:date>
    </item>
    <item>
      <title>Re: PROC GENMOD GEE with overdispersed data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GENMOD-GEE-with-overdispersed-data/m-p/830467#M41126</link>
      <description>&lt;P&gt;I assume that your response variable, Behavior, is a count of some particular type of behavior that a kid might exhibit.&lt;/P&gt;&lt;P&gt;First, note that the GEE model that GENMOD fits is not a multilevel model. (The newer PROC GEE is generally the recommended procedure for fitting GEE models.) The SUBJECT= effect simply designates which observations are considered correlated, with the form of the correlation controlled by the TYPE= option: observations with the same value of the SUBJECT= effect are considered correlated; observations with different values are considered uncorrelated. In your case, it sounds like you would consider all observations within a kid to be correlated, so, assuming that each kid has a unique value of the ID variable, you would just specify SUBJECT=ID.&lt;/P&gt;&lt;P&gt;Next, when you include an offset in a log-linked model, you are modeling the log of a rate, the rate being the count of some event divided by some exposure or population size. To do that, the OFFSET= variable should be the log of the denominator of the rate, that is, the log of the exposure or population size. It should not be the size itself.&lt;/P&gt;&lt;P&gt;Finally, the GEE model itself is a way of dealing with overdispersion, as mentioned in &lt;A href="http://support.sas.com/kb/22630" target="_self"&gt;this note&lt;/A&gt;. So it might not be worthwhile to use the negative binomial distribution instead of the Poisson distribution, since the negative binomial model is much more prone to fitting problems.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Aug 2022 02:58:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GENMOD-GEE-with-overdispersed-data/m-p/830467#M41126</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2022-08-26T02:58:43Z</dc:date>
    </item>
  </channel>
</rss>

