Dealing with missing data

Occasional Contributor
Posts: 10

I am preparing data for survival analysis. I have noticed that my predictor variables have many missing values (for some variables, 15% of the observations are missing).

Do I need to delete the observations with missing values?


Best regards.

Super User
Posts: 13,583

Re: Dealing with missing data

Most procedures ignore missing values by default, and regression-type procedures generally exclude any record in which a variable the model requires, such as a predictor, is missing. You will usually get a diagnostic along the lines of "n records used" that tells you how many records actually went into the analysis. That is why SAS has the special value of "missing".

If you want to test this behavior, try running the procedure twice: once with all of the data, and again selecting only the records without the problem:

data=yourdatasetname (where=(not missing(variablename)));

When the selection is not too complex, the WHERE= dataset option lets you filter the data for tests like this and see whether the results change, without modifying the dataset itself. Most procedures also accept a separate WHERE statement, which can be easier to read when the condition involves functions of one or more variables rather than a simple list of values.
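As a rough sketch of that comparison for a survival model, something like the following could be used. The dataset work.surv and the variables time, status (with 0 assumed to mean censored), age and bp are placeholder names, not taken from the original question, and PROC PHREG is only one of the procedures this might apply to.

```sas
/* Hypothetical names: work.surv, time, status (0 = censored), age, bp */

/* Run 1: all records. PHREG itself leaves out rows in which a model   */
/* variable is missing; compare "Number of Observations Read" with     */
/* "Number of Observations Used" in the output.                        */
proc phreg data=work.surv;
   model time*status(0) = age bp;
run;

/* Run 2: same model, filtered with the WHERE= dataset option */
proc phreg data=work.surv(where=(not missing(age) and not missing(bp)));
   model time*status(0) = age bp;
run;

/* Run 3: the same filter written as a WHERE statement */
proc phreg data=work.surv;
   where nmiss(age, bp) = 0;   /* keep rows where neither predictor is missing */
   model time*status(0) = age bp;
run;
```

Because the filter only removes rows the procedure would have excluded anyway, all three runs should give the same estimates; only the count of observations read should differ.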

I wouldn't delete the records for the first passes through the data.

You may want to investigate imputation if too many variable combinations are missing.
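Before deciding on imputation, it can help to see how much is missing and in which combinations. A minimal sketch, again assuming a placeholder dataset work.surv with numeric predictors age, bp and chol (all hypothetical names):

```sas
/* Hypothetical names: work.surv, age, bp, chol */

/* Count of non-missing (N) and missing (NMISS) values per variable */
proc means data=work.surv n nmiss;
   var age bp chol;
run;

/* NIMPUTE=0 asks PROC MI to report the missing data patterns */
/* without creating any imputed datasets                      */
proc mi data=work.surv nimpute=0;
   var age bp chol;
run;
```

The "Missing Data Patterns" table from PROC MI shows which combinations of variables are missing together and how many records fall into each pattern.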
