I have a negative binomial model where the outcome is the number of complications and my main predictors are three continuous factor scores that relate to where a disease is found in the body. I found an interaction between one of the factor scores (factor3) and the delay, in years, between first experiencing symptoms and receiving surgery. I tried writing ESTIMATE statements, but all the mean estimates come back as zero and no exponentiated results are shown. I am evaluating factor3 at the mean factor score of each of its quartiles (the mean of quartile 1 of factor3 is -1.18, the mean of quartile 2 is -0.33, etc.). For diagnostic delay, I want to know the difference between people with no diagnostic delay because their disease was found incidentally (diagnostic delay of 0) and those with the average diagnostic delay in my population (9.8 years). I am using SAS 9.4. Any help is greatly appreciated.
My simplified code is below:
proc genmod data=data;
model num_complications = factor1 factor2 factor3 white BMI diagnosticdelay alcohol_use smoking factor3*diagnosticdelay / type3 dist=negbin;
estimate 'average q1 by incidential findings' intercept 1 diagnosticdelay 0 factor3 -1.18 diagnosticdelay*factor3 0 / exp;
estimate 'average q2 by incidential findings' intercept 1 diagnosticdelay 0 factor3 -0.33 diagnosticdelay*factor3 0 / exp;
estimate 'average q3 by incidential findings' intercept 1 diagnosticdelay 0 factor3 0.17 diagnosticdelay*factor3 0 / exp;
estimate 'average q4 by incidential findings' intercept 1 diagnosticdelay 0 factor3 1.30 diagnosticdelay*factor3 0 / exp;
estimate 'average q1 by average delay' intercept 1 diagnosticdelay 9.8 factor3 -1.18 diagnosticdelay*factor3 -113.3272 / exp;
estimate 'average q2 by average delay' intercept 1 diagnosticdelay 9.8 factor3 -0.33 diagnosticdelay*factor3 -3.234 / exp;
estimate 'average q3 by average delay' intercept 1 diagnosticdelay 9.8 factor3 0.17 diagnosticdelay*factor3 1.66 / exp;
estimate 'average q4 by average delay' intercept 1 diagnosticdelay 9.8 factor3 1.30 diagnosticdelay*factor3 12.74 / exp;
run;
Because of the interaction, the effect of changing the delay from 0 to 9.8 will differ depending on the level of the interacting variable, factor3. Because the variables involved are continuous, you'll probably want to look through this note (if both variables were CLASS variables then you could use an LSMEANS statement followed by the NLMeans macro as shown for a similar case involving rates in this note). As shown there for a log-linked model, one way to do this is using the Margins macro to obtain the marginal effect for one variable at various values of the interacting variable.
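To make the dependence on factor3 concrete, here is a short derivation for the log-linked model in the question. The symbols are my own notation, not anything GENMOD prints: μ is the mean number of complications, f is a chosen value of factor3, β_d and β_3 are the main-effect coefficients for delay and factor3, and β_{d3} is the interaction coefficient.

```latex
\begin{aligned}
\log \mu &= \beta_0 + \beta_d\,\mathrm{delay} + \beta_3\,\mathrm{factor3}
            + \beta_{d3}\,(\mathrm{delay}\times\mathrm{factor3}) + \cdots \\[4pt]
\log\frac{\mu(\mathrm{delay}=9.8,\ \mathrm{factor3}=f)}
         {\mu(\mathrm{delay}=0,\ \mathrm{factor3}=f)}
         &= 9.8\,\beta_d + 9.8\,f\,\beta_{d3} \\[4pt]
\frac{\mu(\mathrm{delay}=9.8,\ \mathrm{factor3}=f)}
     {\mu(\mathrm{delay}=0,\ \mathrm{factor3}=f)}
         &= \exp\!\left(9.8\,\beta_d + 9.8\,f\,\beta_{d3}\right)
\end{aligned}
```

The intercept and every covariate held at the same value cancel in the difference, so the ratio of means changes with f only through the interaction term. It also follows that the multiplier on the interaction coefficient at a given setting is the product of the two values. At the quartile 1 mean, for example, that is 9.8 × (-1.18) = -11.564, so the -113.3272 in the question's first average-delay statement does not appear to match that product.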
For example, consider this variation on the Poisson model fit to the insurance data in the Getting Started section of the GENMOD documentation. Here, both the car age (AGE) and size (CARNUM) are treated as continuous predictors and the data are altered to create interaction between them. Also, the count is modeled rather than the rate (since no offset is used). The following statements first use GENMOD directly to fit the Poisson model with interaction and to provide a contour plot of the fitted model. The Margins macro then is used to refit the model (it uses GENMOD) and to estimate the effect of changing AGE from 1 to 2 at each of three car size settings (CARNUM=1, 2, and 3). The macro provides estimates at each of those six settings and estimates of the difference at each car size. You can compare the estimates to the plot to see that they make sense.
data insure;
input N C CAR $ AGE LN CARNUM;
datalines;
500 42 small 1 6.21461 1
1200 37 medium 1 7.09008 2
100 1 large 1 4.60517 3
400 10 small 2 5.99146 1
500 73 medium 2 6.21461 2
300 64 large 2 5.70378 3
;
proc genmod data=insure;
model c=age|carnum / dist=poisson;
effectplot;
run;
data mdat;
do age=1,2;
output;
end;
run;
data adat;
do carnum=1 to 3; output; end;
run;
%Margins(data=insure, response=c, model=age|carnum, dist=poisson,
margins=age, margindata=mdat, at=carnum, atdata=adat, options=diff reverse cl)
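Carried over to the model in the question, the same pattern might look like the sketch below. It is untested against the actual data and rests on a few assumptions: the Margins macro has already been downloaded and defined in the session, your version of the macro accepts dist=negbin, and the data set and variable names are the ones posted in the question. The names mdat and adat are simply reused from the example above.

```sas
/* Values of the margin variable to compare: no delay vs. average delay */
data mdat;
do diagnosticdelay=0, 9.8;
output;
end;
run;

/* Quartile means of factor3 taken from the question */
data adat;
do factor3=-1.18, -0.33, 0.17, 1.30;
output;
end;
run;

/* Assumes the Margins macro is already defined and supports dist=negbin */
%Margins(data=data, response=num_complications,
model=factor1 factor2 factor3 white BMI diagnosticdelay alcohol_use smoking factor3*diagnosticdelay,
dist=negbin,
margins=diagnosticdelay, margindata=mdat, at=factor3, atdata=adat,
options=diff reverse cl)
```

As in the insurance example, this should give an estimated mean at each of the eight delay-by-factor3 settings and, with the diff option, the estimated difference between the two delay values at each factor3 setting.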