BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Maomao_Hui
Calcite | Level 5

Hi there,

 

I really need help in filling the missing value.

I want to fill the missing value of value1, value2 and value3 by ''firm'' with last observation.

But I just want to fill the variable of each firm that with the missing value that less than 5.

 

data test;
input firm $2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;

 

data want:

A 2006 03 7 7 5
A 2006 03 8 8 5
A 2006 03 . 9 1
A 2006 06 5 9 1
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . 4 2
A 2006 09 9 7 1
A 2006 09 . 7 5
A 2007 03 . 7 5
A 2007 03 . 7 8
A 2007 03 . 7 8
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 6
B 2006 03 . 2 1
B 2006 03 12 7 1
B 2006 03 10 7 1
B 2007 03 9 1 2
B 2007 03 1 1 2
B 2007 03 1 3 4
B 2007 03 1 3 4
B 2007 03 1 3 4
;
run;

 

So there are two requirements when filling the missing value: 

1. Fill the missing value of each variable by 'firm' with previous data.

2. If the number of missing value > 5, then we do not fill that variable. In this case, because the number of the missing value of value1 of firm A > 5, then it will not be filled by the lasted data.

 

Thank you so much.

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Something like this should work (untested, I don't have SAS handy) :

 

data WANT; 
  do until(last.FIRM);
   set HAVE; 
   by FIRM;
   VAL1NMISS=sum(VAL1NMISS, missing(VALUE1));
   VAL2NMISS=sum(VAL2NMISS, missing(VALUE2));
   VAL3NMISS=sum(VAL3NMISS, missing(VALUE3));
  end;
  do until(last.FIRM);
   set HAVE; 
   by FIRM;
   VALUE1=ifn(VAL1NMISS<5 & VALUE1=. & ^first.FIRM, lag(VALUE1), VALUE1);
   VALUE2=ifn(VAL2NMISS<5 & VALUE2=. & ^first.FIRM, lag(VALUE2), VALUE2);
   VALUE3=ifn(VAL3NMISS<5 & VALUE3=. & ^first.FIRM, lag(VALUE3), VALUE3);
   output;
  end;
run;

View solution in original post

5 REPLIES 5
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Maomao_Hui 

 

You don't specify it, byt it seems from your want-data that rule 2 applies only to value1. The following code gives your wanted result:

 

data test;
input firm $2. year Month value1 value2 value3;
cards;
A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4
;
run;

data want; set test; by firm;
	retain ovalue1 ovalue2 ovalue3;
	drop ovalue1 ovalue2 ovalue3;
	if value1 = . then value1 = ovalue1;
	if value2 = . then value2 = ovalue2;
	if value3 = . then value3 = ovalue3;
	if value1 < 5 then ovalue1 = value1; else ovalue1 = .;
	if value2 > . then ovalue2 = value2; else ovalue2 = .;
	if value3 > . then ovalue3 = value3; else ovalue3 = .;
run;
Maomao_Hui
Calcite | Level 5

Hi @ErikLund_Jensen ,

 

Thank you so much for your kind reply. I am so sorry that I am not clear enough. Actually, I just want to fill the variables with few missing values. For example, the number of the missing value of value1 of firm A is 10, so I ignore it.

Firm A, the number of the missing value of value1 is 10, not fill it.  The number of the missing value of value 2 is 3, fill it. The number of the missing value of value 3 of firm A is 5, so should also not be filled.

Firm B, the number of the missing value of value1 is 3, fill it.  The number of the missing value of value 2 is 4, fill it. The number of the missing value of value 3 is 4, fill it.

Hope this time my explanation would be clearer, and thanks again for your quick reply.

 

Thank you,

 

Hui

 

data test;
input firm $2. year Month value1 value2 value3;
cards;A 2006 03 7 7 5
A 2006 03 8 8 .
A 2006 03 . 9 1
A 2006 06 5 . .
A 2006 06 8 7 4
A 2006 06 . 4 3
A 2006 09 . . 2
A 2006 09 9 7 1
A 2006 09 . . 5
A 2007 03 . 7 .
A 2007 03 . 7 8
A 2007 03 . 7 .
A 2007 06 . 3 2
A 2007 06 . 7 6
A 2007 06 . 8 .
B 2006 03 . 2 1
B 2006 03 12 7 .
B 2006 03 10 . .
B 2007 03 9 1 2
B 2007 03 1 . .
B 2007 03 . 3 4
B 2007 03 . . .
B 2007 03 . 3 4;
run;

data want;
A 2006 03 7 7 5 A 2006 03 8 8 . A 2006 03 . 9 1 A 2006 06 5 9 . A 2006 06 8 7 4 A 2006 06 . 4 3 A 2006 09 . 4 2 A 2006 09 9 7 1 A 2006 09 . 7 5 A 2007 03 . 7 . A 2007 03 . 7 8 A 2007 03 . 7 . A 2007 06 . 3 2 A 2007 06 . 7 6 A 2007 06 . 8 . B 2006 03 . 2 1 B 2006 03 12 7 1 B 2006 03 10 7 1 B 2007 03 9 1 2 B 2007 03 1 1 2 B 2007 03 1 3 4 B 2007 03 1 3 4 B 2007 03 1 3 4


 

ChrisNZ
Tourmaline | Level 20

Something like this should work (untested, I don't have SAS handy) :

 

data WANT; 
  do until(last.FIRM);
   set HAVE; 
   by FIRM;
   VAL1NMISS=sum(VAL1NMISS, missing(VALUE1));
   VAL2NMISS=sum(VAL2NMISS, missing(VALUE2));
   VAL3NMISS=sum(VAL3NMISS, missing(VALUE3));
  end;
  do until(last.FIRM);
   set HAVE; 
   by FIRM;
   VALUE1=ifn(VAL1NMISS<5 & VALUE1=. & ^first.FIRM, lag(VALUE1), VALUE1);
   VALUE2=ifn(VAL2NMISS<5 & VALUE2=. & ^first.FIRM, lag(VALUE2), VALUE2);
   VALUE3=ifn(VAL3NMISS<5 & VALUE3=. & ^first.FIRM, lag(VALUE3), VALUE3);
   output;
  end;
run;
Maomao_Hui
Calcite | Level 5

Thank you so much. I also made a bit adjustment as follow:

 

data WANT;
do until(last.FIRM);
set HAVE;
by FIRM;
VAL1NMISS=sum(VAL1NMISS, missing(VALUE1));
VAL2NMISS=sum(VAL2NMISS, missing(VALUE2));
VAL3NMISS=sum(VAL3NMISS, missing(VALUE3));
end;
do until(last.FIRM);
set HAVE;
by FIRM;
VALUE1=ifn(VAL1NMISS<5 & VALUE1=. & ^first.FIRM, coalesce(VALUE1,n1), VALUE1);n1=VALUE1;
VALUE2=ifn(VAL2NMISS<5 & VALUE2=. & ^first.FIRM, coalesce(VALUE2,n2), VALUE2);n2=VALUE2;
VALUE3=ifn(VAL3NMISS<5 & VALUE3=. & ^first.FIRM, coalesce(VALUE3,n3), VALUE3);n3=VALUE3;
output;
end;
run;

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Maomao_Hui 

 

I think the code I supplied will create your new want-data set, if the rule is applied to all 3 variables:

 

data want; set test; by firm;
	retain ovalue1 ovalue2 ovalue3;
	drop ovalue1 ovalue2 ovalue3;
	if value1 = . then value1 = ovalue1;
	if value2 = . then value2 = ovalue2;
	if value3 = . then value3 = ovalue3;
	if value1 < 5 then ovalue1 = value1; else ovalue1 = .;
	if value2 < 5 then ovalue2 = value2; else ovalue2 = .;
	if value3 < 5 then ovalue3 = value3; else ovalue3 = .;
run;

 

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1963 views
  • 0 likes
  • 3 in conversation