04-29-2017 08:31 PM
I am new to SAS. I am running a linear regression with PROC GENMOD that includes a product interaction term between a categorical exposure variable (rank) and a binary subgroup variable.
proc genmod data=data;
model outcome = rank subgroup rank*subgroup;
run;
I want to see if there is a significant linear trend in the two categories of subgroup separately. I can get the estimates of outcome, but don't know how to do the test for trend. I'd like to do a Wald test so I can control for other covariates. Any suggestions on how to do this?
05-01-2017 06:52 PM
I think what you actually want is an analysis of covariance (ANCOVA)-like model, in which you fit a regression for each level of a categorical predictor. To do that, one of the predictor variables will be a categorical (aka CLASS) variable and another will be a continuous variable.
In your scenario, I would guess that rank should serve as the continuous variable; if so, rank should not be in the CLASS statement. And I would guess that subgroup should serve as the categorical variable; if so, subgroup should be in the CLASS statement.
One parameterization for an ANCOVA-like model is
class subgroup;
model outcome = rank subgroup rank*subgroup;
This model fits a linear regression of outcome on rank for each level of subgroup. The interaction provides a test of H0: slopes equal. If you understand how the categorical variable is coded, you can algebraically compute the intercepts and slopes for the two regressions.
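As a rough sketch of that algebra, ESTIMATE statements can ask GENMOD to combine the coefficients for you. This assumes subgroup is coded 0/1 (so level 0 sorts first), that the default GLM coding for CLASS variables is in effect, and that the data set is named data as in the question; the labels are only illustrative.

```sas
/* Assumption: subgroup takes the values 0 and 1; default GLM coding. */
proc genmod data=data;
  class subgroup;
  model outcome = rank subgroup rank*subgroup;
  /* Slope within a level = rank coefficient + that level's interaction coefficient */
  estimate 'slope of rank, subgroup=0' rank 1 rank*subgroup 1 0;
  estimate 'slope of rank, subgroup=1' rank 1 rank*subgroup 0 1;
  /* Difference between the two slopes; same hypothesis as the interaction test */
  estimate 'slope difference, 0 vs 1' rank*subgroup 1 -1;
run;
```

Each ESTIMATE row reports a Wald chi-square test of H0: the linear combination equals 0, so the first two rows test the trend within each subgroup.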
An equivalent parameterization is
class subgroup;
model outcome = subgroup rank*subgroup / noint;
The parameter estimates for subgroup are the intercepts, and the parameter estimates for rank*subgroup are slopes; no algebra required. The p-value for each slope estimate tests H0: slope=0.
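A minimal sketch of the full step with an adjustment covariate might look like the following. The variable age is purely hypothetical (it is not in the original post) and stands in for whatever covariates you want to control for.

```sas
/* Assumption: age is a hypothetical continuous covariate, used only for illustration. */
proc genmod data=data;
  class subgroup;
  /* subgroup terms: one intercept per subgroup (evaluated at age = 0)          */
  /* rank*subgroup terms: one slope of rank per subgroup, adjusted for age     */
  model outcome = subgroup rank*subgroup age / noint;
run;
```

The Wald chi-square test reported for each rank*subgroup parameter is then the covariate-adjusted test of H0: slope=0 within that subgroup.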
Generally, we use the GENMOD procedure when the response variable follows a distribution other than the normal; for a normally distributed response we can use GLM (among others). But GENMOD handles the normal distribution as well: if you don't specify a distribution option on the MODEL statement, a normal distribution with an identity link is used by default.
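To make the default explicit, the distribution and link can be written out on the MODEL statement. The sketch below also shows what I would expect to be the corresponding PROC GLM call, using the same variable and data set names as the question.

```sas
/* Same model as above, with the default distribution and link spelled out */
proc genmod data=data;
  class subgroup;
  model outcome = subgroup rank*subgroup / noint dist=normal link=identity;
run;

/* Least-squares fit of the same model; SOLUTION prints the parameter estimates */
proc glm data=data;
  class subgroup;
  model outcome = subgroup rank*subgroup / noint solution;
run;
quit;
```

The point estimates should agree between the two procedures. Standard errors and test statistics may differ slightly, because GENMOD fits by maximum likelihood and reports Wald chi-square tests, while GLM uses least squares and reports t tests.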
An important note: the linear regression is on the link scale! If you are using the normal distribution with the identity link, then the measurement units for parameter estimates (including slope) are the same as the original measurement units for outcome. If you are using GENMOD because outcome follows a distribution other than normal (and your code accidentally omitted that additional and necessary information), then the parameter estimates will be on the link scale (e.g., logit, log). A regression that is linear on the link scale will not be linear on the inverse link (original) scale except for the identity link. This feature has implications for interpretation.
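For illustration only, suppose outcome were a count modeled with a Poisson distribution and log link. That is an assumption; nothing in the original post says so. The slopes are then on the log scale, and the EXP option on the ESTIMATE statement reports them back-transformed as multiplicative effects.

```sas
/* Assumption: outcome is a count and subgroup is coded 0/1; both are hypothetical here. */
proc genmod data=data;
  class subgroup;
  model outcome = subgroup rank*subgroup / noint dist=poisson link=log;
  /* Each slope is a change in log(mean) per unit of rank;          */
  /* EXP also reports exp(estimate), a ratio of means per unit rank */
  estimate 'rank effect, subgroup=0' rank*subgroup 1 0 / exp;
  estimate 'rank effect, subgroup=1' rank*subgroup 0 1 / exp;
run;
```

A slope of b on the log scale multiplies the expected outcome by exp(b) for each one-unit increase in rank, which is why the trend is linear on the link scale but not on the original scale.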
There's a lot going on here, and in my opinion ANCOVA is trickier than it looks at first glance, even before considering the potential complications of generalized linear models or your desire to "control for other covariates". You'll want to devote time to gaining familiarity with the methodology as well as with SAS software.