GEE with clustering variable in model part

Posted 04-23-2020 11:10 AM

I have data on the number of persons killed in road accidents, collected every day in all 10 administrative divisions of one state, from November 1st to March 31st, for 16 years (2005 to 2020). I am fitting a population-averaged model (Poisson or negative binomial, for count data).

My interest is in modeling the impact of low temperature on the number of persons killed (I have the daily average temperature from November 1st, 2005 to March 31st, 2020).

Primarily, I would like an overall effect (incidence rate ratio, IRR) at the state level, but I am also interested in the effect within each division. Intuitively, I considered running GEE models with year nested in division as the cluster. To get these effects, I run two separate models, shown below.

My question: is it correct to have a clustering variable in both parts, i.e. in both the REPEATED and the MODEL statements?

Thanks in advance

- Overall state IRR:

proc genmod data=claims;

class x1 year division;

model claims = Temp x1 x2 year / offset=log_workforce dist=nb link=log;

estimate "State" Temp 1 / exp;

repeated subject=year(division) / type=ar;

run;

- Division IRR:

proc genmod data=claims;

class x1 year division;

model claims = Temp|division x1 x2 year / offset=log_workforce dist=nb link=log;

repeated subject=year(division) / type=ar;

estimate "div 1" Temp 1 Temp*division 1 0 0 0 0 0 0 0 0 0 / exp;

estimate "div 2" Temp 1 Temp*division 0 1 0 0 0 0 0 0 0 0 / exp;

estimate "div 10" Temp 1 Temp*division 0 0 0 0 0 0 0 0 0 1 / exp;

run;

Hi StatDave_sas,

Thanks for your response, and for the links; much appreciated.

I think I stated "overall state IRR" incorrectly. There is in fact only one state (which does not appear in the model), with 10 administrative divisions (which I treated as clusters). Over the specified period, for each day and each administrative division, we have the count of persons killed in traffic and the daily mean temperature. So I am modeling the mean daily count of persons killed as a function of the mean daily temperature.

By "overall state IRR", I mean the effect of a 1-unit increase in the mean daily temperature for the entire state, as in the first genmod.

Likewise, the division-level IRR is the effect of a 1-unit increase in the mean daily temperature within each division, as in the second genmod. To get that IRR, I included the temperature-by-division interaction in the model, so division now appears in the MODEL statement as well as in the REPEATED statement, where it defines the cluster.
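To make explicit what the two sets of ESTIMATE statements compute, here is a sketch of the two linear predictors, assuming GENMOD's default GLM parameterization of CLASS variables (last level as reference). The symbols $\beta_T$, $\alpha_d$, $\gamma_d$ and $\eta$ are notation introduced here, not SAS output names; $\eta$ collects the intercept and the x1, x2 and year terms.

$$
\begin{aligned}
\text{Model 1:}\quad \log \mu &= \texttt{log\_workforce} + \eta + \beta_T\,\text{Temp}, & \text{IRR} &= e^{\beta_T} \\
\text{Model 2:}\quad \log \mu_d &= \texttt{log\_workforce} + \eta + \alpha_d + (\beta_T + \gamma_d)\,\text{Temp}, & \text{IRR}_d &= e^{\beta_T + \gamma_d}
\end{aligned}
$$

Under that parameterization $\gamma_{10} = 0$, so the "div 10" estimate reduces to $e^{\beta_T}$, the temperature slope for the reference division.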

Now, I am not sure I can get an effect with LSMEANS for a single continuous variable, as in model 1, unless it is involved in an interaction with a categorical variable.

Thanks again for your help; I would appreciate any input.
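For the slope of a continuous covariate such as Temp, the ESTIMATE statement with the EXP option (as already used in the first model) returns the rate ratio directly, and its coefficient can be rescaled to express a change other than one unit. A minimal sketch, reusing the data set and variable names from the first post; the 5-unit drop is an illustrative choice, not something taken from the data:

```sas
/* Sketch only: same model as the "Overall state IRR" step above. */
proc genmod data=claims;
  class x1 year division;
  model claims = Temp x1 x2 year / offset=log_workforce dist=nb link=log;
  repeated subject=year(division) / type=ar;

  /* Rate ratio for a 1-unit increase in Temp: exp(beta_Temp) */
  estimate "State, +1 unit" Temp 1 / exp;

  /* Rate ratio for a 5-unit decrease in Temp: exp(-5 * beta_Temp).
     The value 5 is hypothetical and only shows the rescaling. */
  estimate "State, -5 units" Temp -5 / exp;
run;
```

Because the interest here is in low temperature, a negative coefficient may be the more natural way to report the effect.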

Thanks Sir, much appreciated

Exactly
