BookmarkSubscribeRSS Feed
bg
Calcite | Level 5 bg
Calcite | Level 5
I have two versions of the same data set..same variable names, Data set A1 has missing values, Data set A2 has missing values filled in. I need to produce a third data set, A3 which has contains only the "filled in" values from data set A2 and missing values in all other spaces. For example I need to produce A3 as follows
Data set A1
X1 X2 X3
1 10 20
. 11 .
3 . 22

Data Set A2
X1 X2 X3
1 10 20
2 11 21
3 12 22

Data Set A3
X1 X2 X3
. , .
2 . 21
. 12 .



Is there a way to do this using Proc SQL or any other ideas? Thanks for any suggestions.

PS I'm using PROC MI to create imputation data sets and I'd like to examine only the values that have been imputed for each variable. In my actual data set I have about 300 variables for which I need to compute original values and only the imputed values.
3 REPLIES 3
ArtC
Rhodochrosite | Level 12
Here is a DATA step that will give you the requested data set. Be sure to examine the diagnostics for MI they can give you good insight into how well the values were imputed.
[pre]Data A1 ;
input X1 X2 X3 ;
datalines;
1 10 20
. 11 .
3 . 22
run;
Data A2;
input X1 X2 X3;
datalines;
1 10 20
2 11 21
3 12 22
run;

data a3(keep=x:);
array yy {9999} _temporary_;
set a2;
array xx {*} x:;
do i = 1 to dim(xx);
yy{i} = xx{i};
end;
set a1;
do i = 1 to dim(xx);
if xx{i}=yy{i} then xx{i}=.;
else xx{i} = yy{i};
end;
run;
proc print data=a3;
run;
[/pre]
This solution makes quite a few assumptions about the accuracy of the problem description. Adjustments to reality can be made using the macro language and other techniques.
Ksharp
Super User
Hi.I also find a method to get that.
And also hope someone can use proc sql to get it.


[pre]
Data A1 ;
input X1 X2 X3 ;
datalines;
1 10 20
. 11 .
3 . 22
run;
Data A2;
input X1 X2 X3;
datalines;
1 10 20
2 11 21
3 12 22
run;

data a1;
set a1;
id+1;*just to success to merge two datasets;
run;
data a2;
set a2;
id+1;*just to success to merge two datasets;
run;
data temp;
set a1;
array x{*} x: ;
do i=1 to dim(x);
if not missing(x{i}) then x{i}=9999;*whatever you want,but not the same with your data value. it is just to flag;
end;
drop i;
run;
data tmp;
update a2 temp;
by id ;
run;
data result;
set tmp(drop=id);
array x{*} x: ;
do i=1 to dim(x);
if x{i}=9999 then call missing(x{i});
end;
drop i;
run;
proc print;
run;


[/pre]



Ksharp Message was edited by: Ksharp
bg
Calcite | Level 5 bg
Calcite | Level 5
Thanks! This is just what I needed. I'll also look into the diagnostics that you mention for Proc MI. Our data set is quite challenging, but if all works out the results will be very useful.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 821 views
  • 0 likes
  • 3 in conversation