07-28-2016 06:07 PM - last edited on 07-28-2016 06:15 PM by Reeza
I have a question about variable selection. Say I have the following data set:
Age Sex weight Pass
15 0 70 1
17 1 - 0
16 - 60 1 ;
I want to use logistic regression to predict pass from age, sex, and weight. I also want to use backward variable selection to select the significant covariates. So I coded it as follows:
proc logistic data=have descending;
model pass = age sex weight / selection=backward fast;
run;
However, I have missing data here. The procedure only uses the observations that are complete under the full model, which here is just the first entry.
But I want it to include more observations as covariates are eliminated. That is, when the model contains only age and sex, I want it to use the first two entries. Is there a way to do this in SAS?
Thank you very much!!!
07-28-2016 06:47 PM
The procedure will only use records that have values for all the model variables.
You might try imputation to replace the missing values. There are a number of ways to do this, but if a significant percentage of values is missing for a given variable, the resulting model is going to be suspect.
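As one possible illustration of the imputation route, here is a minimal sketch using multiple imputation with PROC MI and PROC MIANALYZE. It assumes the data set is named have as in the question, that sex is a numeric 0/1 variable, and that the data set is much larger than the three rows shown. The output names have_mi and pe, the seed, and the number of imputations are arbitrary choices for illustration. It fits the full model only; combining variable selection with multiple imputation takes additional care and is not shown.

```sas
/* Sketch only: impute missing sex and weight with fully conditional specification. */
/* have_mi, pe, the seed, and nimpute=5 are illustrative assumptions.               */
proc mi data=have nimpute=5 seed=12345 out=have_mi;
   class sex;
   fcs logistic(sex) reg(weight);
   var age sex weight pass;
run;

/* Fit the same logistic model separately on each imputed data set. */
proc logistic data=have_mi descending;
   by _Imputation_;
   model pass = age sex weight;
   ods output ParameterEstimates=pe;
run;

/* Pool the per-imputation estimates into one set of inferences. */
proc mianalyze parms=pe;
   modeleffects Intercept age sex weight;
run;
```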
What percentage of your records are missing one or more of the variables?
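A quick way to answer that question is sketched below, assuming the data set is named have and that missing values are stored as SAS missing values rather than as a literal "-". The data set name miss_flag and the variable any_missing are made up for this example.

```sas
/* Count non-missing (N) and missing (NMISS) values for each variable. */
proc means data=have n nmiss;
   var age sex weight pass;
run;

/* Flag records that are missing at least one model variable. */
data miss_flag;
   set have;
   any_missing = (cmiss(age, sex, weight, pass) > 0);
run;

/* The percentage with any_missing = 1 is the share of records      */
/* that PROC LOGISTIC would drop when fitting the full model.       */
proc freq data=miss_flag;
   tables any_missing;
run;
```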
07-28-2016 10:25 PM
You could screen your predictors with PROC HPSPLIT. It offers three methods for dealing with missing predictors. The output decision tree is a crude model but will show you which variables are the most important predictors.
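A minimal sketch of that screening step follows. It assumes the SAS/STAT version of PROC HPSPLIT, where the missing-value handling is chosen with the ASSIGNMISSING= option (BRANCH, POPULAR, or SIMILAR), and a data set named have as in the question. Check the documentation for your release, since option names have differed between versions of the procedure.

```sas
/* Sketch only: grow a classification tree for pass and inspect */
/* the variable importance table in the output.                 */
proc hpsplit data=have assignmissing=similar;
   class pass sex;
   model pass = age sex weight;
   grow entropy;
   prune costcomplexity;
run;
```

Predictors that rank low in the variable importance table are candidates to drop before fitting the logistic model.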