## How to fill missing value less of variables with the number of missing value less than 3.

Hi there,

I really need help in filling the missing value.

I want to fill the missing value of value1, value2 and value3 by ''firm'' with last observation.

But I just want to fill the variable of each firm that with the missing value that less than 5.

data test;
input firm \$2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;

data want:

A 2006 03 7 7 5
A 2006 03 8 8 5
A 2006 03 . 9 1
A 2006 06 5 9 1
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . 4 2
A 2006 09 9 7 1
A 2006 09 . 7 5
A 2007 03 . 7 5
A 2007 03 . 7 8
A 2007 03 . 7 8
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 6
B 2006 03 . 2 1
B 2006 03 12 7 1
B 2006 03 10 7 1
B 2007 03 9 1 2
B 2007 03 1 1 2
B 2007 03 1 3 4
B 2007 03 1 3 4
B 2007 03 1 3 4
;
run;

So there are two requirements when filling the missing value:

1. Fill the missing value of each variable by 'firm' with previous data.

2. If the number of missing value > 5, then we do not fill that variable. In this case, because the number of the missing value of value1 of firm A > 5, then it will not be filled by the lasted data.

Thank you so much.

2 REPLIES 2

## Re: How to fill missing value less of variables with the number of missing value less than 3.

``````data test;
input firm \$2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;

proc sort data=test;;
by firm year month;
run;

data want;
set test;
by firm year month;
retain newvalue1 newvalue2 newvalue3;
if first.firm then do;newvalue1=.;newvalue2=.;newvalue3=.;end;
if value1 ne . then newvalue1=value1;
if value2 ne . then newvalue2=value2;
if value3 ne . then newvalue3=value3;
if value1 eq . and newvalue1 >5 then newvalue1=.;
run;``````
Thanks,
Jag

## Re: How to fill missing value less of variables with the number of missing value less than 3.

Untested but should work:

``````proc summary data=have;
by firm;
var value1 value2 value3;
output out=missing_count (drop=_type_ _freq_) nmiss=mis_val1 mis_val2 mis_val3;
run;

data want;
merge have missing_count;
by firm;
if first.firm then do;
replacement_val1 = .;
replacement_val2 = .;
replacement_val3 = .;
end;
retain replacement_val: ;
drop replacement_val:  mis_val: ;
if mis_val1 <= 5 and value1 > . then replacement_val1 = value1;
if mis_val2 <= 5 and value2 > . then replacement_val2 = value2;
if mis_val3 <= 5 and value3 > . then replacement_val3 = value3;
if value1 = . then value1 = replacement_val1;
if value2 = . then value2 = replacement_val2;
if value3 = . then value3 = replacement_val3;
run;``````

Test it and see if you run into any problems.

Discussion stats
• 2 replies
• 261 views
• 0 likes
• 3 in conversation