Hi there,
I really need help in filling the missing value.
I want to fill the missing value of value1, value2 and value3 by ''firm'' with last observation.
But I just want to fill the variable of each firm that with the missing value that less than 5.
data test;
input firm $2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;
data want:
A 2006 03 7 7 5
A 2006 03 8 8 5
A 2006 03 . 9 1
A 2006 06 5 9 1
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . 4 2
A 2006 09 9 7 1
A 2006 09 . 7 5
A 2007 03 . 7 5
A 2007 03 . 7 8
A 2007 03 . 7 8
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 6
B 2006 03 . 2 1
B 2006 03 12 7 1
B 2006 03 10 7 1
B 2007 03 9 1 2
B 2007 03 1 1 2
B 2007 03 1 3 4
B 2007 03 1 3 4
B 2007 03 1 3 4
;
run;
So there are two requirements when filling the missing value:
1. Fill the missing value of each variable by 'firm' with previous data.
2. If the number of missing value > 5, then we do not fill that variable. In this case, because the number of the missing value of value1 of firm A > 5, then it will not be filled by the lasted data.
Thank you so much.
Please try
data test;
input firm $2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;
proc sort data=test;;
by firm year month;
run;
data want;
set test;
by firm year month;
retain newvalue1 newvalue2 newvalue3;
if first.firm then do;newvalue1=.;newvalue2=.;newvalue3=.;end;
if value1 ne . then newvalue1=value1;
if value2 ne . then newvalue2=value2;
if value3 ne . then newvalue3=value3;
if value1 eq . and newvalue1 >5 then newvalue1=.;
run;
Untested but should work:
proc summary data=have;
by firm;
var value1 value2 value3;
output out=missing_count (drop=_type_ _freq_) nmiss=mis_val1 mis_val2 mis_val3;
run;
data want;
merge have missing_count;
by firm;
if first.firm then do;
replacement_val1 = .;
replacement_val2 = .;
replacement_val3 = .;
end;
retain replacement_val: ;
drop replacement_val: mis_val: ;
if mis_val1 <= 5 and value1 > . then replacement_val1 = value1;
if mis_val2 <= 5 and value2 > . then replacement_val2 = value2;
if mis_val3 <= 5 and value3 > . then replacement_val3 = value3;
if value1 = . then value1 = replacement_val1;
if value2 = . then value2 = replacement_val2;
if value3 = . then value3 = replacement_val3;
run;
Test it and see if you run into any problems.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.