Missing value

02-03-2018 02:38 PM

I am handling a cross-sectional data set.

I found that a number of my variables of interest have missing values, including continuous, bi-variate and ordinal data.

I am wondering if I can replace the missing values with each variable's mean or median.

I also want to know if there is a cap on how much missing data can be replaced this way, e.g., missing values must be less than 10% or so.

Any help would be much appreciated.

Phan S.

Accepted Solutions

Solution

02-03-2018 03:50 PM

Posted in reply to PhanS

02-03-2018 03:32 PM

Here's an inefficient macro that will cap the outliers.

https://gist.github.com/statgeek/31316a678433a1db8136

PROC STDIZE is a better option for replacing missing values with the median or mean. You can also look into PROC MI, which uses multiple imputation to fill in missing data.
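As a rough sketch of the two options, assuming a data set named have with numeric variables a, b and c (hypothetical names, as are the output data set names and the seed), something along these lines could be used. The PROC MI call uses the procedure's default method; binary or ordinal variables may need a different imputation model than continuous ones, so check the PROC MI documentation before relying on it.

```sas
/* Assumed: data set HAVE with numeric variables a, b, c (hypothetical names) */

/* Option 1: replace only the missing values with each variable's mean */
proc stdize data=have reponly method=mean out=imputed_mean;
   var a b c;
run;

/* Option 2: multiple imputation, creating 5 completed copies of the data. */
/* The output data set stacks the copies, indexed by _Imputation_.          */
proc mi data=have nimpute=5 seed=12345 out=mi_out;
   var a b c;
run;
```

With multiple imputation, each completed copy is usually analyzed separately and the results are then combined, for example with PROC MIANALYZE.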


All Replies

Posted in reply to PhanS

02-03-2018 02:51 PM

Hi,

Missing values can be replaced with various statistics using PROC STDIZE. Below is an example that replaces missing values with the median.

Whether to apply a cap depends on your analysis. You can flag variables that have more than a certain percentage of missing values before deciding on imputation.

proc stdize data=have reponly method=median out=imputed;
   var a b c; /* Assuming a, b and c are 3 numeric variables */
run;
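To see how much is missing before imputing, one possible approach is to count missing values per variable and turn the counts into percentages. This is only a sketch: it reuses the data set have and variables a, b and c from the example above, and the names miss_counts, miss_pct, nm_* and pct_* are made up for illustration.

```sas
/* Count missing values for each variable (one output row, no printed report) */
proc means data=have noprint;
   var a b c;
   output out=miss_counts nmiss=nm_a nm_b nm_c;
run;

/* Convert the counts to percentages; _FREQ_ holds the number of observations */
data miss_pct;
   set miss_counts;
   array nm{3}  nm_a nm_b nm_c;
   array pct{3} pct_a pct_b pct_c;
   do i = 1 to 3;
      pct{i} = 100 * nm{i} / _freq_;
   end;
   drop i _type_;
run;

proc print data=miss_pct;
run;
```

The percentages can then be compared against whatever threshold suits the analysis, such as the 10% mentioned in the question.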

Posted in reply to stat_sas

02-03-2018 03:59 PM

Hello,

I am sorry, I meant to give the solution (credit) to you, but accidentally marked Reeza's reply instead.

Thank you for your code.

Phan S.

Posted in reply to PhanS

02-03-2018 04:31 PM

Posted in reply to PhanS

02-03-2018 10:25 PM

