Xiaoningdemao
Quartz | Level 8

Dear All,

 

I have a question about variable selection. Say I have the following data set:

 

Age    Sex    Weight    Pass
15     0      70        1
17     1      -         0
16     -      60        1

 

I want to use logistic regression to predict pass from age, sex, and weight. I also want to use backward variable selection to select the significant covariates, so I coded the following:

 

proc logistic data=have descending;
   model pass = age sex weight / selection=backward fast;
run;

 

However, I have missing data. The procedure uses only the records that are complete for the full model, which here is just the first record.

 

But I want it to include more observations as covariates are eliminated. That is, when the model contains only age and sex, I want it to use the first two records. Is there a way to do this in SAS?

 

Thank you very much!!!

 

Best,

 

 

 


2 REPLIES
ballardw
Super User

The procedure will only use records that have values for all the model variables.
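
To see how much this matters for a given data set, one option is to count the complete records under each candidate model and then refit a reduced model by hand. The sketch below is only an illustration: it rebuilds the three-record example from the question (with "." as the SAS missing value), and the column names n_full_model, n_age_sex, and n_age_weight are made up for this example.

```sas
/* Toy data from the question; a period (.) is a SAS numeric missing value */
data have;
   input age sex weight pass;
   datalines;
15 0 70 1
17 1 . 0
16 . 60 1
;
run;

/* Count the records that are complete under each candidate model.
   With several arguments, NMISS() works row by row and returns the
   number of missing arguments, so (nmiss(...) = 0) is 1 only when
   every listed variable is present. */
proc sql;
   select sum(nmiss(age, sex, weight, pass) = 0) as n_full_model,
          sum(nmiss(age, sex, pass) = 0)         as n_age_sex,
          sum(nmiss(age, weight, pass) = 0)      as n_age_weight
   from have;
quit;

/* Manual refit on the real data: once weight has been dropped, fitting
   the reduced model in its own PROC LOGISTIC step lets records that
   are missing only weight be used again. (The three toy records are
   far too few for a meaningful fit.) */
proc logistic data=have descending;
   model pass = age sex;
run;
```

Keep in mind that models fitted on different subsets of records are not directly comparable, so a stepwise comparison done this way should be read with caution.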

 

You might try imputation to replace the missing values. There are a number of ways to do this, but if a significant percentage of values is missing for a given variable, the resulting model is going to be suspect.
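
As one possible way to do the imputation, here is a minimal multiple-imputation sketch using PROC MI, PROC LOGISTIC, and PROC MIANALYZE. The number of imputations, the seed, the FCS methods chosen for sex and weight, and the data set names mi_out and lgsparms are all assumptions made for illustration. It fits the full model rather than running backward selection, and it needs a realistically sized data set, not the three example records.

```sas
/* Step 1: create 5 completed copies of the data.
   NIMPUTE=, SEED=, and the FCS methods are illustrative choices. */
proc mi data=have nimpute=5 seed=20240101 out=mi_out;
   class sex;                       /* binary variable imputed by logistic regression */
   fcs logistic(sex) reg(weight);   /* weight imputed by linear regression */
   var age sex weight pass;
run;

/* Step 2: fit the same logistic model within each imputed copy */
proc logistic data=mi_out descending;
   by _Imputation_;
   model pass = age sex weight;
   ods output ParameterEstimates=lgsparms;
run;

/* Step 3: pool the estimates across the imputed copies */
proc mianalyze parms=lgsparms;
   modeleffects Intercept age sex weight;
run;
```

Every record then contributes to each fit, at the cost of relying on the imputation model being reasonable.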

 

What percentage of your records are missing one or more of the variables?

PGStats
Opal | Level 21

You could screen your predictors with PROC HPSPLIT. It offers three methods for dealing with missing predictors. The output decision tree is a crude model but will show you which variables are the most important predictors.
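
A rough sketch of such a screening run is shown below. The ASSIGNMISSING=SIMILAR setting and the GROW and PRUNE choices are assumptions picked for illustration, and the MODEL statement form may differ in older releases, so check the documentation for your version.

```sas
/* Screen predictors with a classification tree.
   ASSIGNMISSING= controls how records with missing predictor values
   are routed at each split (BRANCH, POPULAR, or SIMILAR). */
proc hpsplit data=have assignmissing=similar;
   class pass sex;                 /* categorical target and predictor */
   model pass = age sex weight;
   grow entropy;                   /* splitting criterion (illustrative choice) */
   prune costcomplexity;           /* pruning method (illustrative choice) */
run;
```

The variable importance table in the output can then be used to decide which predictors to carry into PROC LOGISTIC.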

PG
