Count Methodology for imputation?

edasdfasdfasdfa · Posted 04-30-2019 06:14 PM

I read the following (below) in some article on here:

For categorical variables, the most common methodology is “count” wherein you fill the missing values with the most common level of the categorical variable.

How is this performed? I can't find any information on it.

ballardw · Posted 04-30-2019 06:26 PM

One very crude method: Proc Freq plus a data step. Find the most frequent occurrence using proc freq then something like:

Data want;
   set have;
   if missing(var) then var='mostcommonvalue';
run;
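The two steps above can be chained so the most common value does not have to be typed in by hand. This is only a minimal sketch: `have`, `want` and `var` are the placeholder names from the snippet above, `var_counts` and `mode_value` are made-up names, and it assumes `var` has at least one nonmissing value. Ties between equally common levels are not handled in any particular way.

```sas
/* 1. Count each nonmissing level of VAR (placeholder name). */
proc freq data=have noprint;
   where not missing(var);
   tables var / out=var_counts;   /* VAR_COUNTS is a made-up name */
run;

/* 2. Put the most frequent level in the first row. */
proc sort data=var_counts;
   by descending count;
run;

/* 3. Read that first row once, then fill the missing values with it. */
data want;
   if _n_ = 1 then
      set var_counts(obs=1 keep=var rename=(var=mode_value));
   set have;
   if missing(var) then var = mode_value;
   drop mode_value;
run;
```

Reading the mode from a data set rather than a macro variable avoids quoting problems when the level contains characters such as `&` or quotes, and it works the same way for character and numeric variables.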

The same approach works for replacing with a mean value: use proc means/summary to get the mean, then replace the missing values in a data step.
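A rough sketch of the mean version, under the same assumptions: `numvar` is a hypothetical numeric variable in `have`, and `numvar_stats` and `numvar_mean` are made-up names.

```sas
/* 1. Compute the mean of NUMVAR (hypothetical variable name);
      missing values are ignored by PROC MEANS. */
proc means data=have noprint;
   var numvar;
   output out=numvar_stats(keep=numvar_mean) mean=numvar_mean;
run;

/* 2. Read the one-row summary once, then replace missing values. */
data want;
   if _n_ = 1 then set numvar_stats;
   set have;
   if missing(numvar) then numvar = numvar_mean;
   drop numvar_mean;
run;
```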

edasdfasdfasdfa · Posted 04-30-2019 06:29 PM

For numeric variables, you can use proc stdize but I have never seen documentation on character variables.

I.e.:

/* reponly: replace missing values only, without standardizing the data */
proc stdize data=train method=median reponly out=traini;
   var var1;
run;

Reeza · Posted 04-30-2019 06:39 PM

You need to first understand how and why the values are missing before you can say which method is appropriate. Using the largest group isn't a great method. An alternative is to model the data and predict the category, using logistic regression or discriminant analysis. Both are covered in PROC MI, and both have examples in the documentation (Examples 79.4 and 79.5):

https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_mi_examples04.htm&docsetVersion=1...
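As a rough idea of what model-based imputation of a category might look like, here is a sketch using the FCS statement in PROC MI. It is not copied from the documentation examples, and everything in it is an assumption: `have` is the input data set, `var` is the categorical variable with missing values, `x1` and `x2` are hypothetical numeric predictors, and the seed and number of imputations are arbitrary.

```sas
/* Sketch only: data set and variable names are placeholders. */
proc mi data=have seed=12345 nimpute=5 out=mi_out;
   class var;                          /* treat VAR as categorical          */
   fcs logistic(var / link=glogit);    /* generalized logit for a nominal   */
                                       /* variable with several levels      */
   var x1 x2 var;                      /* predictors plus the imputed var   */
run;
```

The output data set stacks the imputed copies of the data and identifies each one with the `_Imputation_` variable, so any later analysis has to be run by `_Imputation_` and combined, rather than treated as a single filled-in table.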

