raheleh22
Obsidian | Level 7

I have a data set in which my outcome variable is a count. It is not overdispersed, so I am using PROC GENMOD with DIST=POISSON and LINK=LOG. This is my model:

model counts= theme1 theme2 theme3 theme4 overall / dist=poisson link=log;
run; 

The counts in my Excel file are per county, and I wonder how I can get estimates from my model for each county.

I have tried IF and WHERE statements on the county variable, but neither worked.

Any advice is appreciated.

Thanks 

 

6 REPLIES
StatDave
SAS Super FREQ
It's not clear what your data set is like and what it is that you want. If you want a separate model for each county and if you have multiple observations for each county with a count and predictor values in each observation, then you can just add a BY COUNTY; statement in your GENMOD step (after sorting by COUNTY). Or, if you want a single model that adjusts for COUNTY differences, then maybe you need to put COUNTY in the CLASS and MODEL statements, possibly including interactions of COUNTY with your predictors if needed.
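As a minimal sketch of those two options, assuming a data set named mydata (a hypothetical name) with several observations per county and the variable names from the question:

```sas
/* Option 1: a separate Poisson model for each county.
   BY-group processing requires the data to be sorted by COUNTY first. */
proc sort data=mydata;
   by county;
run;

proc genmod data=mydata;
   by county;
   model counts = theme1 theme2 theme3 theme4 overall / dist=poisson link=log;
run;

/* Option 2: a single model that adjusts for county differences.
   COUNTY goes in both the CLASS and MODEL statements. */
proc genmod data=mydata;
   class county;
   model counts = county theme1 theme2 theme3 theme4 overall / dist=poisson link=log;
run;
```

Option 1 produces one table of parameter estimates per county. Option 2 produces one set of theme estimates plus a shift for each county level. Interactions such as county*theme1 could be added to Option 2 if the theme effects are expected to differ by county.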
raheleh22
Obsidian | Level 7

I have shown my dataset below and this is the model I used: 

data time1;
input county counts theme1 theme2 theme3 theme4 overall;
ln= log(counts);
datalines;

proc genmod data=time1;
class county;
model counts= theme1 theme2 theme3 theme4 Overall county/dist=poisson link=log offset=ln;
run;

After running this model, I am getting an estimate of 0 for all counties. I am not sure which step I am doing wrong.

I want to get estimates for each theme per county. 

So this is how my dataset looks: 

COUNTY  Counts  Theme1  Theme2  Theme3  Theme4  Overall
1       35      1       0.9286  0.8571  0.6429  1
2       112     0.2143  0.7143  0.5714  0.5714  0.4286
3       61      0.3571  0       0.2857  0.9286  0.2857
4       84      0.5714  0.5714  0.0714  0.3571  0.5
5       30      0.6429  0.3571  0.4286  0.7857  0.7143
6       8       0.1429  0.4286  0.3571  0.2143  0.2143
7       30      0.5     0.2857  0.4286  0.5     0.6429
8       9227    0       0.0714  0.7143  0.2857  0.0714
9       114     0.6429  0.5714  0.1429  0.4286  0.5714
10      93      0.9286  1       0.7857  0.8571  0.9286
11      2070    0.2143  0.1429  0.5714  0.6429  0.3571
12      1109    0.4286  0.2143  0.2143  0.0714  0.1429
13      28      0.7857  0.8571  1       0.1429  0.7857
14      99      0.0714  0.4286  0       0       0
15      65      0.8571  0.7857  0.9286  1       0.8571
StatDave
SAS Super FREQ
You shouldn't use the log of your response variable as an offset. The offset is just another predictor in the model with its parameter restricted to equal 1. As a result, you effectively are modeling a constant rate of 1. So, just remove OFFSET=LN from your MODEL statement.
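To see why every estimate comes out as 0, here is the algebra as a rough sketch. The symbols are my own notation, not GENMOD output: y_i is the observed count, mu_i the fitted mean, x_i the predictor values, and beta the coefficients.

```latex
\begin{align*}
\log \mu_i &= x_i^\top \beta + \log y_i
  && \text{(offset enters with its coefficient fixed at 1)} \\
\frac{\mu_i}{y_i} &= \exp\!\left(x_i^\top \beta\right) \\
\beta = 0 \;&\Rightarrow\; \mu_i = y_i \quad \text{for every observation } i
\end{align*}
```

A fitted mean equal to the observed count is the best fit a Poisson model can achieve, so the likelihood is already maximized with all coefficients at 0.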
StatDave
SAS Super FREQ

But you won't be able to estimate a parameter for every county AND estimate the parameters of the predictors, since that means trying to estimate more parameters than there are observations. The only way you could estimate the parameters for the predictors separately for each county is by having a set of observations for every county. You can't do it with only one observation for each county.

raheleh22
Obsidian | Level 7

That is helpful. I have now rearranged my data so that each county has its own data set. The new data set covers a single county and contains counts, theme1, theme2, theme3, theme4, and overall (all continuous), plus year (categorical). This is my new model:

proc genmod data=county1;
class year(ref='1');
model counts= theme1 theme2 theme3 theme4 overall year/dist=poisson link=log;
run;

The output still gives the estimates for the year categories and the estimates for the themes separately.

So how can I get the theme estimates within each year category?

Thanks a lot, 

StatDave
SAS Super FREQ
To do that you would have to include interactions between YEAR and the other predictors such as
model counts = year theme1*year theme2*year theme3*year theme4*year / dist=poisson;
But you will have the same problem if the number of parameters exceeds the number of observations. For instance, if each observation in your new data set is for a separate year, then you essentially have the same problem as before.
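Putting that MODEL statement into a full step might look like the sketch below. It assumes the county1 data set and the REF='1' coding from your last post, and that each year has several observations; the PROC FREQ step is only there to check that assumption before fitting.

```sas
/* Count the observations available in each year. If each year has only
   one row, the interaction model has more parameters than observations. */
proc freq data=county1;
   tables year / nocum nopercent;
run;

/* YEAR gives each year its own shift relative to the reference year.
   Each theme*YEAR term, with no theme main effect in the model,
   gives one slope estimate per year level. */
proc genmod data=county1;
   class year(ref='1');
   model counts = year theme1*year theme2*year theme3*year theme4*year
         / dist=poisson link=log;
run;
```

In the parameter estimates table, the rows labeled theme1*year (and likewise for the other themes) are then the theme slopes for each year.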
