mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

Accepted Solution Solved
Reply
Occasional Contributor Nav
Occasional Contributor
Posts: 6
Accepted Solution

mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

Hi

 

I  need to generate a mismatch report by comparing  three variable values (one set) from one dataset with  2 sets  of variable values in a different dataset.

 

For example Dataset one has var1 var2 var3 etc. and dataset two has  var1 var2 var3  var4 var5 var6 etc.

 

I need to check whether  (one.var1 one.var2 and one. var3 )matches with( two. var1,two.var2 two.var3) or (two. var4, two.var5, two.var6) and generate a mismatch report having only mismatches(fallouts). of one.var1 onevar2 and one var3

 

I was wondering what could be the best way(efficient way) to program this scenaro.

 

Thanks in advance!

 

 

 


Accepted Solutions
Solution
‎03-19-2017 10:47 PM
Occasional Contributor Nav
Occasional Contributor
Posts: 6

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

[ Edited ]

NAV@ ASTOUNDING,

 

Thank you !  

 

  Simple and smart solution. 

  Seems I am thinking too much and making it more complicated . 

 

I have tried and compared with my other program output where I have used Except operator.

GOT the same results as your program . 

 

Thank you very much everyone  for all your valuable time and  suggestions .

  

   

 

KIND REGARDS

NAV

View solution in original post


All Replies
PROC Star
Posts: 7,433

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

How many records in each dataset and is there an ID variable or, if not, are both datasets' records in the same order?

 

Art, CEO, AnalystFinder.com

PROC Star
Posts: 7,433

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

Depending upon your answer, and whether the report has to be detailed (and, if so, to what level of specificity), you might be able to get by with something as simple as:

 

data one;
  input var1-var3;
  cards;
1 1 1
2 1 2
1 2 1
;
data two;
  input var1-var6;
  cards;
1 1 1 1 1 1
2 1 2 2 2 2
1 2 1 1 2 2
;

data both (keep=var1-var3 match);
  set one;
  set two (rename=(var1=_var1 var2=_var2 var3=_var3));
  array one(3) var1-var3;
  array twoa(3) _var1-_var3;
  array twob(3)  var4-var6;
  do i=1 to 3;
    if one(i) ne twoa(i) or one(i) ne twob(i) then match='mismatch';
  end;
run;

Art, CEO, AnalystFinder.com

 

Occasional Contributor Nav
Occasional Contributor
Posts: 6

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

Thank you very much!!
Will try this and let you know. I need only mismatches of dataset one. with var1 var2 and var3 .along with comments column where I need to specify which variable/variables not found in dataset two .
Occasional Contributor Nav
Occasional Contributor
Posts: 6

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

Hi ,

 

Thanks art297!!

There are no ID variables In my dataset  and order is also not same.

 

What I need to check is whether the value of ( one.var1 ||one. var2||one.var3) eq (two.var1||two.var2||two.var3) OR

( one.var1 ||one. var2||one.var3) eq ( two.var4 ||two. var5||two.var6) .

Right now I have 150 obs in data one and  data two is master dataset which has 500 obs .

Note: one.var3 two.var3 two.var6  might have missing values .

I have used proc sql with except operator but not sure whether my approach is right.

 

Thanks

NAV

 

 

 

 

 

 

Super User
Posts: 3,233

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

The COMPARE procedure can also do this. I reckon it is one of SAS's best kept "secrets".

 

http://support.sas.com/documentation/cdl/en/proc/70377/HTML/default/viewer.htm#n0c1y14wyd3u7yn1dmfcp...

 

 

PROC Star
Posts: 7,433

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

To insure we're all on the same page, consider the following two datasets:

data one;
  input var1-var3;
  cards;
1 1 1
2 1 2
1 2 1
;
data two;
  input var1-var6;
  cards;
1 1 1 1 1 1
2 1 2 2 2 2
1 2 1 1 2 2
1 1 1 1 1 1
2 1 2 2 4 2
1 2 1 1 2 2
1 1 1 1 1 1
2 1 2 2 2 1
1 2 1 1 2 3
;

Given those two data sets, show us what you want your output to look like.

 

Art, CEO, AnalystFinder.com

 

Occasional Contributor Nav
Occasional Contributor
Posts: 6

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

[ Edited ]

Hi , 

 Sorry @art297 I should have given  dummy data  to avoid confusion.

 

Super User
Posts: 5,363

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

[ Edited ]

Here's a simple way:

 

data checklist (keep=var1-var3);

set two (keep=var1-var6);

output;

var1=var4;

var2=var5;

var3=var6;

output;

run;

 

proc sort data=checklist nodupkey;

by var1 var2 var3;

run;

 

proc sort data=one;

by var1 var2 var3;

run;

 

data mismatch;

merge one checklist (in=on_the_master);

by var1 var2 var3;

if on_the_master=0;

run;

 

Your data sets are small enough that "easiest to follow" is more important than "speediest to run".

PROC Star
Posts: 7,433

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

@Nav: One aspect of your approach surprizes me as, earlier, you said that dataset one was the smaller of the two. Is it the one that has the desired entries and are their duplicates and/or conflicts within its entries?

 

Also, is case relevant or irrelevant?

 

Art, CEO, AnalystFinder.com

 

Occasional Contributor Nav
Occasional Contributor
Posts: 6

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

[ Edited ]

Hi @art297

 

Yes Dataset one is smaller  .In the example I have provided I didn't create lots of observations for dataset two, however  Dataset two has lots of observations .

Each observation of dataset one( var1 var2 var3) should be compared with two sets in data two  and if there is mismatch then 

only mismatched observations of dataset one should be in  the  mismatch dataset. 

 

Yes  it is Case relevant !!

 

Thank you very much!!

 

 

 

 

Solution
‎03-19-2017 10:47 PM
Occasional Contributor Nav
Occasional Contributor
Posts: 6

Re: mismatch report with 1 set of 3 variables from one dataset with two sets in other dataset

[ Edited ]

NAV@ ASTOUNDING,

 

Thank you !  

 

  Simple and smart solution. 

  Seems I am thinking too much and making it more complicated . 

 

I have tried and compared with my other program output where I have used Except operator.

GOT the same results as your program . 

 

Thank you very much everyone  for all your valuable time and  suggestions .

  

   

 

KIND REGARDS

NAV

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 384 views
  • 0 likes
  • 4 in conversation