12-03-2013 12:15 PM
I am preparing data for survival analysis. I have noticed that my predictor variables have many missing values (for some variables, 15% of the observations are missing).
Do I need to delete the observations that have missing values?
12-04-2013 11:01 AM
Most procedures ignore missing values by default, and the regression-type procedures will generally exclude any record that has a missing value for a "required" variable, such as a predictor. You will generally get a diagnostic that says something along the lines of "n records used". That is why SAS has the special value of "missing".
If you want to test this behavior, try running the procedure twice: once with all of the data, and again selecting only the records without the problem:
data=yourdatasetname (where=(not missing(variablename)));
The WHERE= dataset option can, when the selection is not too complex, filter the data for tests like this, so you can see whether the results change without modifying the dataset. Some procedures also accept a separate WHERE statement, which can be a bit more flexible than the dataset option, as it may allow the use of functions involving one or more variables instead of just a list of values.
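As a rough sketch of that comparison, assuming a Cox model fitted with PROC PHREG: the dataset work.surv and the variables time, status (0 = censored), age and bmi are made-up names for illustration, so substitute your own.

```sas
/* Hypothetical names: work.surv, time, status (0 = censored), age, bmi */

/* Run 1: all records. PHREG itself leaves out any row with a missing
   model variable; compare "Number of Observations Read" with
   "Number of Observations Used" in the output. */
proc phreg data=work.surv;
   model time*status(0) = age bmi;
run;

/* Run 2: same model, with rows that have a missing bmi filtered out
   by the WHERE= dataset option before the procedure sees them. */
proc phreg data=work.surv(where=(not missing(bmi)));
   model time*status(0) = age bmi;
run;

/* The same idea as a WHERE statement. CMISS counts the missing
   values across all of the listed variables. */
proc phreg data=work.surv;
   where cmiss(age, bmi) = 0;
   model time*status(0) = age bmi;
run;
```

If the estimates from the runs match, the procedure was already excluding the incomplete records on its own, and nothing needs to be deleted from the dataset.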
I wouldn't delete the records for the first passes through the data.
You may want to investigate imputation if too many variable combinations are missing.
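One way to see which combinations of variables are missing, before deciding on imputation, is to ask PROC MI for the missing data patterns only. This is just a sketch: work.surv, age and bmi are again hypothetical names, and the VAR list as written assumes numeric variables.

```sas
/* NIMPUTE=0 requests no imputations, so PROC MI only reports the
   "Missing Data Patterns" table for the listed variables. */
proc mi data=work.surv nimpute=0;
   var age bmi;
run;
```

Each row of the pattern table is one combination of present and missing variables, together with its frequency and percentage, which gives a sense of how many records a complete-case analysis would lose.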