BookmarkSubscribeRSS Feed
Anandkvn
Lapis Lazuli | Level 10
data ds_missing_values;
	infile datalines dlm=',';
	input group $ var1 var2 var3;
	datalines;
A, 1, 2, 3
A, ., 2, 5
A, 3, 4, .
A, 8, ., .
B, ., ., .
B, ., 2, 6
B, 3, 1, 8
C, 9, 4, 8

;
run;

How to replace missing values with next values for repective variables missings output like below  

A 1 2 3
A 3 2 5
A 8 4 6
A 3 2 6
B 3 2 6
B 3 2 8
C 9 1 8
5 REPLIES 5
Kurt_Bremser
Super User

Why does group change to "A" in every observation?

 


@Anandkvn wrote:
data ds_missing_values;
	infile datalines dlm=',';
	input group $ var1 var2 var3;
	datalines;
A, 1, 2, 3
A, ., 2, 5
A, 3, 4, .
A, 8, ., .
B, ., ., .
B, ., 2, 6
B, 3, 1, 8
C, 9, 4, 8

;
run;

How to replace missing values with next values for repective variables missings output like below  

A 1 2 3
A 3 2 5
A 8 4 6
A 3 2 6
A 3 2 6
A 3 2 8
A 9 1 8

 

ballardw
Super User

When you have a variable like your Group variable it seems unlikely that you should acquire values from a different group to replace missing.

 

Can you describe the purpose of replacing these values? How will you use the data? That may give us some other ideas.

 


@Anandkvn wrote:
data ds_missing_values;
	infile datalines dlm=',';
	input group $ var1 var2 var3;
	datalines;
A, 1, 2, 3
A, ., 2, 5
A, 3, 4, .
A, 8, ., .
B, ., ., .
B, ., 2, 6
B, 3, 1, 8
C, 9, 4, 8

;
run;

How to replace missing values with next values for repective variables missings output like below  

A 1 2 3
A 3 2 5
A 8 4 6
A 3 2 6
B 3 2 6
B 3 2 8
C 9 1 8

 

mkeintz
Jade | Level 19

In group A, your sample input shows the VAR1 sequence to be   1, ., 3,. 8.   

 

Why does your sample output show  1,3,8,3?  Or are you prepared to take the next VAR1 from group B?  And it looks like you are MOVING a future non-missing value back, not just copying it.  Is that right

 

And why does the sample input have 3 group B observations, but the sample output have only 2?

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

And additionally to questions already asked: What should happen in if the last row in a group has a missing value? Leave it missing or something else?

mkeintz
Jade | Level 19

In the absence of a reply, here's some code that solves what I presume is the typical "bringing back the future" problem.  For each record, a missing value is replaced by the earliest following non-missing value, within the same group.  This means of course that any group that ends with missing values will not have those values replaced:

 

data ds_missing_values;
	infile datalines dlm=',';
	input group $ var1 var2 var3;
	datalines;
A, 1, 2, 3
A, ., 2, 5
A, 3, 4, .
A, 8, ., .
B, ., ., .
B, ., 2, 6
B, 3, 1, 8
C, 9, 4, 8
run;

data want (drop=_:);
  set ds_missing_values (in=firstpass)
      ds_missing_values (in=secondpass);
  by group;

  array var {3} var1-var3;

  array history {3,20}  _histv1_1-_histv1_20  
                        _histv2_1-_histv2_20  
                        _histv3_1-_histv3_20 ;
  retain _: ;

  if first.group then call missing(_f,_s,of _:);
  _f+firstpass;

  if firstpass then do _v=1 to 3;
    history{_v,_f}=var{_v};
  end;

  if secondpass;
  _s+secondpass;
  call missing(history{1,_s},history{2,_s},history{3,_s});
  if var1=. then var1=coalesce(of _histv1:);
  if var2=. then var2=coalesce(of _histv2:);
  if var3=. then var3=coalesce(of _histv3:);
run;

The program passes through eac group twice: the first time to build a history (up to 20 obs in this case) of values for VAR1, VAR2, and VAR3.  The second pass will tap into that history, replacing a missing value with the earliest following non-missing (that's what the coalesce functiom does).  Also during the second pass, current data in the history array is eliminated, so only future values are exposed to the coalesce functions.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 241 views
  • 1 like
  • 5 in conversation