Hi All,
I am running a multiple logistic regression with mortality (event = 1) as the dependent variable and four age ranges (65-69, 70-74, 75-79, 80+) as the independent variables (see code below). When I run the analysis, the variable "Age 80+" shows zero degrees of freedom and no estimate is produced for it (see image below).
The code is as follows:
proc logistic data=WORK.OBESITY;
   class Age_65_69 Age_70_74 Age_75_79 Age_80_ / param=glm descending;
   model Overall_Mortality(event='1') = Age_65_69 Age_70_74 Age_75_79 Age_80_ /
         link=logit technique=fisher;
run;
Reviewing the data, each independent variable has either 0 or 1 as possible inputs, and none of the columns share the same sum.
I would greatly appreciate any possible explanation and subsequent solution for this problem!
Thanks!
Accepted Solutions
The results are correct. Your age variable has 4 levels and therefore 3 degrees of freedom, meaning that only 3 independent parameters can be estimated. If every observation falls in exactly one of the four ranges, the four indicators sum to 1 in every row, so the last one is redundant given the intercept and the other three, and it is reported with zero degrees of freedom. The best, and easiest, way to do this is not to create separate indicator variables as you have done, but rather to create a single age variable with 4 distinct values indicating which range each observation falls in. Then, specify that single age variable in the CLASS and MODEL statements. The result will still show 3 parameter estimates for this 4-level variable.
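As a rough sketch of that suggestion, the four indicators could be collapsed into one character variable and passed to PROC LOGISTIC. The names AgeGroup and Obesity2 are made up for illustration, the indicators are assumed to be numeric 0/1, and the choice of "65-69" as the reference level is only an assumption; pick whichever reference suits your analysis.

```sas
/* Sketch only: AgeGroup and Obesity2 are hypothetical names.        */
/* Assumes the four indicators are numeric 0/1 and mutually exclusive. */
data work.Obesity2;
   set work.Obesity;
   length AgeGroup $5;
   if      Age_65_69 = 1 then AgeGroup = "65-69";
   else if Age_70_74 = 1 then AgeGroup = "70-74";
   else if Age_75_79 = 1 then AgeGroup = "75-79";
   else if Age_80_   = 1 then AgeGroup = "80+";
   /* Rows matching none of the indicators keep a blank AgeGroup,   */
   /* which PROC LOGISTIC treats as missing and excludes.           */
run;

proc logistic data=work.Obesity2;
   /* Reference coding: 3 estimates, each compared with the 65-69 group */
   class AgeGroup (ref="65-69") / param=ref;
   model Overall_Mortality(event='1') = AgeGroup;
run;
```

With PARAM=REF each of the three estimates is the log odds of mortality for that age group relative to the reference group, which is usually easier to read than the GLM parameterization.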
I suspect that there are no valid responses for the 80+ group. You can use PROC FREQ to examine the crosstab:
proc freq data=Obesity;
tables Overall_Mortality*Age_80_ / nocol norow nopercent missing;
run;
I suggest that your analysis might be more interpretable if you create ONE categorical variable that has the age levels that you want. You can use PROC FORMAT to bin the Age variable into categories. Here is an example that uses the Sashelp.Class data:
proc format;
   value AgeFmt
      low - 13  = "11-13"
      14 - 16   = "14-16"
      17 - high = "17+";
run;
proc print data=Sashelp.Class;
   format Age AgeFmt.;
run;
proc glm data=Sashelp.Class;
   format Age AgeFmt.;
   class Age;
   model weight = Age / solution;
run;
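The same idea might carry over to the original mortality model along these lines. This is only a sketch: it assumes the Obesity data set still contains a numeric Age variable (the thread does not confirm this), and the format name AgeGrpFmt and the reference level are illustrative choices.

```sas
/* Sketch only: assumes WORK.OBESITY has a numeric variable named Age. */
/* AgeGrpFmt is a hypothetical format name.                            */
proc format;
   value AgeGrpFmt
      65 -< 70   = "65-69"
      70 -< 75   = "70-74"
      75 -< 80   = "75-79"
      80 - high  = "80+";
run;

proc logistic data=work.Obesity;
   format Age AgeGrpFmt.;
   /* CLASS levels are formed from the formatted values of Age */
   class Age (ref="65-69") / param=ref;
   model Overall_Mortality(event='1') = Age;
run;
```

The `-<` ranges exclude the upper endpoint, so non-integer ages such as 69.5 still land in a group. Ages below 65 are not covered by the format and would show up as their own levels, so check the class level information in the output.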