BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lichee
Quartz | Level 8

I'm trying to replace missing values with the values of the previous row before the row with missing. The example data is as below: the data set "have" is the data I have, and "want" is the target data I want to generate. The idea is that when the fields ID and var1 have missing values, the row with missing takes the value from the row with value before the row with missing value.

Thank you very much!

 

data have;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $;
datalines;
1,aa,aa1
,,aa2
2,bb,bb
3,cc,cc
4,dd,dd
5,ee,ee
6,ff,ff
7,gg,gg1
,,gg2
,,gg3
8,hh,hh1
,,hh2
,,hh3
;
run;

data want;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $;
datalines;
1,aa,aa1
1,aa,aa2
2,bb,bb
3,cc,cc
4,dd,dd
5,ee,ee
6,ff,ff
7,gg,gg1
7,gg,gg2
7,gg,gg3
8,hh,hh1
8,hh,hh2
8,hh,hh3
;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Here you go.

 

data have;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $ var3 $ var4;
datalines;
1,aa,aa1,aaa,
,,aa2,,10
2,bb,bb,bbb,6
3,cc,cc,ccc,8
4,dd,dd,ddd,
5,ee,ee,eee,7
6,ff,ff,fff,
7,gg,gg1,ggg,3
,,gg2,,5
,,gg3,,6
8,hh,hh1,hhh,
,,hh2,,
,,hh3,,
;
run;

data have2;
   retain dummy 1;
   set have;
   run;
proc print; run;
data want;
   if 0 then set have2;
   update have2(keep=dummy obs=0) have2(keep=dummy Person_id var1-var2);
   by dummy;
   set have2(keep=var3-var4);
   output;
   drop dummy;
   run;
proc print;
   run;

Capture.PNG

View solution in original post

8 REPLIES 8
data_null__
Jade | Level 19

Try the update trick.

 

data have;
	infile datalines truncover dsd;
	input Person_ID var1 $ var2 $;
	datalines;
1,aa,aa1
,,aa2
2,bb,bb
3,cc,cc
4,dd,dd
5,ee,ee
6,ff,ff
7,gg,gg1
,,gg2
,,gg3
8,hh,hh1
,,hh2
,,hh3
;
run;

data have2;
   retain dummy  1;
   set have;
   run;

data want;
   update have2(obs=0) have2;
   by dummy;
   output;
   drop dummy;
   run;
proc print; run;
lichee
Quartz | Level 8
Thanks a lot!

The code works for the dummy data. What if there are many other fields with missing values but other fields do not need to be populated for their missing values in the same way? For example, if I only wanted to populate var1 but not var2, can we specify fields to be populated? Thanks again!
data_null__
Jade | Level 19

Use the UPDATE/BY statements and keep (data set options) only the variables that you want to carry forward.  Then use SET dropping the variables you kept in the UPDATE statement.

 

If you supply some sample data I will make example.

 

if 0 then set have2;
update have2(keep=dummy obs=0) have2(keep=dummy x y z);
by dummy;
set have2(drop=dummy x y z);
lichee
Quartz | Level 8

The dummy data would look like below. Basically only Person_ID and var1 need to be populated (as from data have to data want), but not var3 or var4 even thought they have missing values as well.

data have;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $ var3 $ var4;
datalines;
ID,var1,var2,var3,var4
1,aa,aa1,aaa,
,,aa2,,10
2,bb,bb,bbb,6
3,cc,cc,ccc,8
4,dd,dd,ddd,
5,ee,ee,eee,7
6,ff,ff,fff,
7,gg,gg1,ggg,3
,,gg2,,5
,,gg3,,6
8,hh,hh1,hhh,
,,hh2,,
,,hh3,,
;
run;

 

data want;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $ var3 $ var4;
datalines;
1,aa,aa1,aaa,
1,aa,aa2,,10
2,bb,bb,bbb,6
3,cc,cc,ccc,8
4,dd,dd,ddd,
5,ee,ee,eee,7
6,ff,ff,fff,
7,gg,gg1,ggg,3
7,gg,gg2,,5
7,gg,gg3,,6
8,hh,hh1,hhh,
8,hh,hh2,,
8,hh,hh3,,
;
run;

 

lichee
Quartz | Level 8

Sorry! Data have should be as below:

 

data have;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $ var3 $ var4;
datalines;
1,aa,aa1,aaa,
,,aa2,,10
2,bb,bb,bbb,6
3,cc,cc,ccc,8
4,dd,dd,ddd,
5,ee,ee,eee,7
6,ff,ff,fff,
7,gg,gg1,ggg,3
,,gg2,,5
,,gg3,,6
8,hh,hh1,hhh,
,,hh2,,
,,hh3,,
;
run;

 

Thank you!

data_null__
Jade | Level 19

Here you go.

 

data have;
infile datalines truncover dsd;
input Person_ID var1 $ var2 $ var3 $ var4;
datalines;
1,aa,aa1,aaa,
,,aa2,,10
2,bb,bb,bbb,6
3,cc,cc,ccc,8
4,dd,dd,ddd,
5,ee,ee,eee,7
6,ff,ff,fff,
7,gg,gg1,ggg,3
,,gg2,,5
,,gg3,,6
8,hh,hh1,hhh,
,,hh2,,
,,hh3,,
;
run;

data have2;
   retain dummy 1;
   set have;
   run;
proc print; run;
data want;
   if 0 then set have2;
   update have2(keep=dummy obs=0) have2(keep=dummy Person_id var1-var2);
   by dummy;
   set have2(keep=var3-var4);
   output;
   drop dummy;
   run;
proc print;
   run;

Capture.PNG

MayurJadhav
Quartz | Level 8

Here is the solution to retain values on multiple variables and assign the values on the basis of different conditions. The variables may have different or the same conditions.

 

In this example I've used following conditions:  Person_ID eq .   var1 eq''  var2 eq ''

But it can be easily changed and adjusted according to your requirement. @data_null__ 

data have;
	infile datalines truncover dsd;
	input Person_ID var1 $ var2 $;
	datalines;
1,aa,aa1
,,aa2
2,bb,bb
3,cc,cc
4,dd,dd
5,ee,ee
6,ff,ff
7,gg,gg1
,,gg2
,,gg3
8,hh,hh1
,,hh2
,,hh3
;
run;
proc print;

data want;
    set have;
    retain temp1 temp2 temp3;
    if _N_=1 then do;
    	temp1=Person_ID; 
    	temp2=var1; 
    	temp3=var2; 
    	output; 
    	end;
    else do; 
    	/* for Person_ID */
    	if Person_ID eq . then do; 
    		temp1=temp1; 
    		Person_ID=temp1; 
    		end;
    	else do; 
    		temp1=Person_ID; 
    		Person_ID=Person_ID; 
    		end;
     	
     	/* for var1 */
    	if var1 eq '' then do; 
    		temp2=temp2; 
    		var1=temp2; 
    		end;
    	else do; 
    		temp2=var1; 
    		var1=var1; 
    		end;
     	
     	/* for var2 */
    	if var2 eq '' then do; 
    		temp3=temp3; 
    		var2=temp3; 
    		end;
    	else do; 
    		temp3=var2; 
    		var2=var2; 
    		end;
     	
     output;
     temp1=Person_ID;
     temp2=var1;
     temp3=var2;
     
     end;
   drop temp1 temp2 temp3;
 run;
 proc print;
    

MayurJadhav_0-1683757444301.png

 

Mayur Jadhav
BI Developer. Writer. Creative Educator.

SAS Blog → https://learnsascode.com
YouTube Channel: → https://www.youtube.com/@imayurj
MayurJadhav
Quartz | Level 8

This method should work! Looking at your input dataset you just need to add var4 with condition:

if var4 eq . then do

 

It's a bit long code but simple and easy to understand. It would be possible to further optimize this code, possibly using ARRAY or other sas procedures. 

But this method is good enough to achieve desired output.

 

@lichee 

Mayur Jadhav
BI Developer. Writer. Creative Educator.

SAS Blog → https://learnsascode.com
YouTube Channel: → https://www.youtube.com/@imayurj

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 784 views
  • 5 likes
  • 3 in conversation