Hi there,
I really need help in filling the missing value.
I want to fill the missing value of value1, value2 and value3 by ''firm'' with last observation.
But I just want to fill the variable of each firm that with the missing value that less than 5.
data test;
input firm $2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;
data want:
A 2006 03 7 7 5
A 2006 03 8 8 5
A 2006 03 . 9 1
A 2006 06 5 9 1
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . 4 2
A 2006 09 9 7 1
A 2006 09 . 7 5
A 2007 03 . 7 5
A 2007 03 . 7 8
A 2007 03 . 7 8
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 6
B 2006 03 . 2 1
B 2006 03 12 7 1
B 2006 03 10 7 1
B 2007 03 9 1 2
B 2007 03 1 1 2
B 2007 03 1 3 4
B 2007 03 1 3 4
B 2007 03 1 3 4
;
run;
So there are two requirements when filling the missing value:
1. Fill the missing value of each variable by 'firm' with previous data.
2. If the number of missing value > 5, then we do not fill that variable. In this case, because the number of the missing value of value1 of firm A > 5, then it will not be filled by the lasted data.
Thank you so much.
Please try
data test;
input firm $2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;
proc sort data=test;;
by firm year month;
run;
data want;
set test;
by firm year month;
retain newvalue1 newvalue2 newvalue3;
if first.firm then do;newvalue1=.;newvalue2=.;newvalue3=.;end;
if value1 ne . then newvalue1=value1;
if value2 ne . then newvalue2=value2;
if value3 ne . then newvalue3=value3;
if value1 eq . and newvalue1 >5 then newvalue1=.;
run;
Untested but should work:
proc summary data=have;
by firm;
var value1 value2 value3;
output out=missing_count (drop=_type_ _freq_) nmiss=mis_val1 mis_val2 mis_val3;
run;
data want;
merge have missing_count;
by firm;
if first.firm then do;
replacement_val1 = .;
replacement_val2 = .;
replacement_val3 = .;
end;
retain replacement_val: ;
drop replacement_val: mis_val: ;
if mis_val1 <= 5 and value1 > . then replacement_val1 = value1;
if mis_val2 <= 5 and value2 > . then replacement_val2 = value2;
if mis_val3 <= 5 and value3 > . then replacement_val3 = value3;
if value1 = . then value1 = replacement_val1;
if value2 = . then value2 = replacement_val2;
if value3 = . then value3 = replacement_val3;
run;
Test it and see if you run into any problems.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.