Hi
I need to generate a mismatch report by comparing three variable values (one set) from one dataset with 2 sets of variable values in a different dataset.
For example Dataset one has var1 var2 var3 etc. and dataset two has var1 var2 var3 var4 var5 var6 etc.
I need to check whether (one.var1 one.var2 and one. var3 )matches with( two. var1,two.var2 two.var3) or (two. var4, two.var5, two.var6) and generate a mismatch report having only mismatches(fallouts). of one.var1 onevar2 and one var3
I was wondering what could be the best way(efficient way) to program this scenaro.
Thanks in advance!
NAV@ ASTOUNDING,
Thank you !
Simple and smart solution.
Seems I am thinking too much and making it more complicated .
I have tried and compared with my other program output where I have used Except operator.
GOT the same results as your program .
Thank you very much everyone for all your valuable time and suggestions .
KIND REGARDS
NAV
How many records in each dataset and is there an ID variable or, if not, are both datasets' records in the same order?
Art, CEO, AnalystFinder.com
Depending upon your answer, and whether the report has to be detailed (and, if so, to what level of specificity), you might be able to get by with something as simple as:
data one; input var1-var3; cards; 1 1 1 2 1 2 1 2 1 ; data two; input var1-var6; cards; 1 1 1 1 1 1 2 1 2 2 2 2 1 2 1 1 2 2 ; data both (keep=var1-var3 match); set one; set two (rename=(var1=_var1 var2=_var2 var3=_var3)); array one(3) var1-var3; array twoa(3) _var1-_var3; array twob(3) var4-var6; do i=1 to 3; if one(i) ne twoa(i) or one(i) ne twob(i) then match='mismatch'; end; run;
Art, CEO, AnalystFinder.com
Hi ,
Thanks art297!!
There are no ID variables In my dataset and order is also not same.
What I need to check is whether the value of ( one.var1 ||one. var2||one.var3) eq (two.var1||two.var2||two.var3) OR
( one.var1 ||one. var2||one.var3) eq ( two.var4 ||two. var5||two.var6) .
Right now I have 150 obs in data one and data two is master dataset which has 500 obs .
Note: one.var3 two.var3 two.var6 might have missing values .
I have used proc sql with except operator but not sure whether my approach is right.
Thanks
NAV
The COMPARE procedure can also do this. I reckon it is one of SAS's best kept "secrets".
To insure we're all on the same page, consider the following two datasets:
data one; input var1-var3; cards; 1 1 1 2 1 2 1 2 1 ; data two; input var1-var6; cards; 1 1 1 1 1 1 2 1 2 2 2 2 1 2 1 1 2 2 1 1 1 1 1 1 2 1 2 2 4 2 1 2 1 1 2 2 1 1 1 1 1 1 2 1 2 2 2 1 1 2 1 1 2 3 ;
Given those two data sets, show us what you want your output to look like.
Art, CEO, AnalystFinder.com
Here's a simple way:
data checklist (keep=var1-var3);
set two (keep=var1-var6);
output;
var1=var4;
var2=var5;
var3=var6;
output;
run;
proc sort data=checklist nodupkey;
by var1 var2 var3;
run;
proc sort data=one;
by var1 var2 var3;
run;
data mismatch;
merge one checklist (in=on_the_master);
by var1 var2 var3;
if on_the_master=0;
run;
Your data sets are small enough that "easiest to follow" is more important than "speediest to run".
@Nav: One aspect of your approach surprizes me as, earlier, you said that dataset one was the smaller of the two. Is it the one that has the desired entries and are their duplicates and/or conflicts within its entries?
Also, is case relevant or irrelevant?
Art, CEO, AnalystFinder.com
Hi @art297
Yes Dataset one is smaller .In the example I have provided I didn't create lots of observations for dataset two, however Dataset two has lots of observations .
Each observation of dataset one( var1 var2 var3) should be compared with two sets in data two and if there is mismatch then
only mismatched observations of dataset one should be in the mismatch dataset.
Yes it is Case relevant !!
Thank you very much!!
NAV@ ASTOUNDING,
Thank you !
Simple and smart solution.
Seems I am thinking too much and making it more complicated .
I have tried and compared with my other program output where I have used Except operator.
GOT the same results as your program .
Thank you very much everyone for all your valuable time and suggestions .
KIND REGARDS
NAV
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.