Interpreting PROC GENMOD output

newtriks · Posted 08-11-2023 05:50 PM

Hello, this may be a stupid question but I'm having trouble interpreting my output.

Code:
DATA NPS;
INPUT Year Visits EFG STU;
YearIndx = Year-2012;
LogVisits = Log(Visits/1000000);
Event=1; Incident=EFG; Rate=EFG/(Visits/1000000); OUTPUT;
Event=0; Incident=STU; Rate=STU/(Visits/1000000); OUTPUT;
DATALINES;
2013 273630895 4187 620
2014 292800082 5498 796
2015 307247252 6160 1283
2016 330971689 6753 3196
2017 330882751 6605 3114
2018 318211833 6111 3214
2019 327516619 2820 2976
2020 237064332 1463 2993
;
PROC GENMOD DATA=BAKER.NPSvisittrendsCOVID plots=all;
model STU = year / dist=negbin link=log offset=LogVisits type3;
RUN;

Maximum likelihood parameter estimates from PROC GENMOD:
Parameter    Estimate   Standard Error   Wald 95% Confidence Limits   Wald Chi-Square   Pr>ChiSq
Intercept    -504.867   93.7332          -688.580   -321.153          29.01             <.0001
Year         0.2445     0.0465           0.1533     0.3356            27.66             <.0001
Dispersion   0.0737     0.0367           0.0278     0.1955

The way I'm interpreting is this: Exp(-504.867 + Year*0.2445) = STU. This is clearly wrong, because when I calculate that I get nothing close to the STU number. What am I missing?? Thanks in advance.

FreelanceReinh · Posted 08-12-2023 05:27 AM

Hello @newtriks,

@newtriks wrote:
PROC GENMOD DATA=BAKER.NPSvisittrendsCOVID plots=all;
model STU = year / dist=negbin link=log offset=LogVisits type3;
RUN;

Maximum likelihood parameter estimates from PROC GENMOD:
Parameter    Estimate   Standard Error   Wald 95% Confidence Limits   Wald Chi-Square   Pr>ChiSq
Intercept    -504.867   93.7332          -688.580   -321.153          29.01             <.0001
Year         0.2445     0.0465           0.1533     0.3356            27.66             <.0001
Dispersion   0.0737     0.0367           0.0278     0.1955

The way I'm interpreting is this: Exp(-504.867 + Year*0.2445) = STU. This is clearly wrong, because when I calculate that I get nothing close to the STU number. What am I missing??

The offset is missing. Exp(LogVisits - 504.867 + Year*0.2445) will be closer to STU.
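
One way to avoid the hand arithmetic altogether is to let PROC GENMOD write the fitted means to a data set. The following is only a minimal sketch: it assumes the model is fitted to the NPS data set from the first post, restricted to Event=0 so that each year appears once, and the names NPS_pred and PredSTU are made up for illustration.

```sas
/* Sketch only: fit the same model and save the fitted means. */
proc genmod data=NPS(where=(Event=0));   /* assumption: one row per year */
  model STU = Year / dist=negbin link=log offset=LogVisits;
  /* PredSTU (hypothetical name) holds the fitted mean on the count scale; */
  /* the offset is already part of it.                                     */
  output out=NPS_pred pred=PredSTU;
run;

/* Compare observed and fitted counts side by side. */
proc print data=NPS_pred noobs;
  var Year STU LogVisits PredSTU;
run;
```

PredSTU should agree with Exp(LogVisits + Intercept + Year*slope) when the intercept and slope are taken from the same run and the same input data set.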

newtriks · Posted 08-14-2023 03:39 PM

Thanks for responding - it still doesn't appear to work, though.

Let's take 2020, for example. Logvisits = log(park_visits/1000000), or log(237.064332), which equals an offset of 5.471.

So the expression yielding the predicted value would be exp(5.471 - 504.867 + 2020*0.2445). This yields 0.004 predicted, 2993 actual.

I'm doing something wrong but I can't put my finger on it.

Any help you might provide would be greatly appreciated. Thanks!

ballardw · Posted 08-14-2023 04:06 PM

I don't use GENMOD, so walk me through what your NPS data set is doing. I think this may be important: you show us the code that creates NPS, but the GENMOD step reads a different data set, BAKER.NPSvisittrendsCOVID. In NPS you create the variables Event and Incident but do not use them anywhere in the GENMOD step that I can see. So are you sure that the GENMOD code is correct for the data set shown? When I run that GENMOD step on the given data set, the results are not what you show, so something seems a bit off.

For a start, the intercept estimate and all the standard errors are different:

Analysis Of Maximum Likelihood Parameter Estimates
Parameter	DF	Estimate	Standard Error	Wald 95% Confidence Limits		Wald Chi-Square	Pr > ChiSq
Intercept	1	-491.051	66.2794	-620.956	-361.146	54.89	<.0001
Year	1	0.2445	0.0329	0.1800	0.3089	55.31	<.0001
Dispersion	1	0.0737	0.0259	0.0370	0.1469

It is confusing to introduce terms like your "park_visits" that do not appear in the data. If I have to guess that a variable named "visits" is supposed to be treated as "park_visits", I get very uncomfortable: I have seen too much data with similar variable names to be happy with that sort of assumption.

FreelanceReinh · Posted 08-15-2023 03:50 AM

@newtriks wrote:

Thanks for responding - it still doesn't appear to work, though.

Let's take 2020, for example. Logvisits = log(park_visits/1000000), or log(237.064332), which equals an offset of 5.471.

log(237.064332)=5.468331...

As ballardw has already pointed out, your intercept estimate -504.867 is not consistent with your data, for which your PROC GENMOD code ([edit:] i.e., applied to dataset NPS) yields -491.051. The seemingly small relative difference between these numbers has a big impact when the exponential function is applied: with -491.051 the result for 2020 is 4053.48 (same order of magnitude as STU=2993), as opposed to 0.004051... with -504.867. The factor of (close to) 1,000,000 between these results, namely exp(504.867-491.051), suggests that your incorrect intercept is due to a missing division (or multiplication) by 1,000,000 at some point in your calculation.
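
The arithmetic above can be reproduced in a short DATA step. This is a sketch that uses the rounded estimates quoted in this thread, so the last digits may differ slightly from a full-precision fit; the variable names pred_posted, pred_refit and ratio are made up for illustration.

```sas
/* Hand check for 2020 with the rounded estimates quoted in the thread. */
data _null_;
  Year        = 2020;
  LogVisits   = log(237064332/1000000);                  /* offset, about 5.4683 */
  pred_posted = exp(LogVisits - 504.867 + 0.2445*Year);  /* intercept from the first post */
  pred_refit  = exp(LogVisits - 491.051 + 0.2445*Year);  /* intercept from refitting on NPS */
  ratio       = exp(504.867 - 491.051);                  /* factor between the two predictions */
  log_million = log(1000000);                            /* about 13.8155, for comparison with 13.816 */
  put LogVisits= pred_posted= pred_refit= ratio= log_million=;
run;
```

The two intercepts differ by 13.816, which is almost exactly log(1000000). That is the shift one would expect if the offset in the fitted data set had been computed as log(Visits) rather than log(Visits/1000000): the Year slope stays the same and only the intercept moves. Whether that is what happened in BAKER.NPSvisittrendsCOVID cannot be told from the thread.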
