Data Analysis-Matches/Non Matches look up logic?

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 137
Accepted Solution

Data Analysis-Matches/Non Matches look up logic?

Hi SAS Folks,

I have a PARENT DATASET with the following fields:

ID    Name    Location

1    Mark     New york

2    Stephen    London

3    Keith    Portland

4    Lee    Austin

5    Jim    Dallas

6    Jack    Houston

7    Kay    Salem

8    Andrew    Chicago

9    Josh    stpaul

10    Raymond    Paris

Using some Business logic, the parent dataset has been validated based on ID, Name and Location and have derived with 3 datasets like below:

ID_Validated_ table                                

ID    Name Location

1    Mark     New York

2    Stephen     London

3    Keith     Portland

4    Lee     Austin

5    Jim     Dallas

6    Jack    Houston

Name_Validated_Table

ID    Name Location

4    Lee    Austin

5    Jim    Dallas

6    Jack    Houston

7    Kay    Salem

Location_Validated_Table

ID Name Location

1    Mark     New york

3    Keith    Portland

5    Jim    Dallas

7    Kay    Salem

9    Josh    stpaul

I used a datastep merge to find fully validated record from the above three and got: "5    Jim    Dallas" as the only full validated records. My objective is find out invalidated records(exceptions) and add a variable like

reason_for_invalid: /*with values such as ID, Location, Name.

Your help would be most appreciated,

Charlotte

PS SAS-L folks might have noticed the same question there as well. Please accept my apology if you are annoyed to see this here as well


Accepted Solutions
Solution
‎06-24-2015 10:00 AM
Super User
Posts: 9,681

Re: Data Analysis-Matches/Non Matches look up logic?

Charlotte,

How about :

Code: Program

data have;
input ID   Name   $ Location & $20.;
cards;
1 Mark New york
2 Stephen London
3 Keith Portland
4 Lee Austin
5 Jim Dallas
6 Jack Houston
7 Kay Salem
8 Andrew Chicago
9 Josh stpaul
10 Raymond Paris
;
data ID_Validated_table;  
input ID   Name   $ Location & $20.;
cards;
1 Mark New York
2 Stephen London
3 Keith Portland
4 Lee Austin
5 Jim Dallas
6 Jack Houston
;
data Name_Validated_Table;
input ID   Name   $ Location & $20.;
cards;
4 Lee Austin
5 Jim Dallas
6 Jack Houston
7 Kay Salem
;
data Location_Validated_Table;
input ID   Name   $ Location & $20.;
cards;
1 Mark New york
3 Keith Portland
5 Jim Dallas
7 Kay Salem
9 Josh stpaul
;
run;
data want;
if _n_ eq 1 then do;
  if 0 then set ID_Validated_table(keep=ID ) ;
  declare hash h_id(dataset:'ID_Validated_table(keep=ID )');
  h_id.definekey('ID');
  h_id.definedone();
 
  if 0 then set Name_Validated_Table(keep=Name ) ;
  declare hash h_name(dataset:'Name_Validated_Table(keep=Name )');
  h_name.definekey('Name');
  h_name.definedone();

  if 0 then set Location_Validated_Table(keep=Location) ;
  declare hash h_location(dataset:'Location_Validated_Table(keep=Location)');
  h_location.definekey('Location');
  h_location.definedone();
end;
set have;
length reason_for_invalid $ 40;
if h_id.check() ne 0 then reason_for_invalid=catx(' ',reason_for_invalid,'ID');
if h_name.check() ne 0 then reason_for_invalid=catx(' ',reason_for_invalid,'Name');
if h_location.check() ne 0 then reason_for_invalid=catx(' ',reason_for_invalid,'Location');
run;

Xia Keshan

View solution in original post


All Replies
Valued Guide
Posts: 858

Re: Data Analysis-Matches/Non Matches look up logic?

I wrote out a label for each option, not sure if this is something you'd be interested but here's an idea:

data have1;

infile cards dsd;

input ID$ name$ location$;

cards;

1,Mark,New York

2,Stephen,London

3,Keith,Portland

4,Lee,Austin

5,Jim,Dallas

6,Jack,Houston

;

data have2;

infile cards dsd;

input ID$ name$ location$;

cards;

4,Lee,Austin

5,Jim,Dallas

6,Jack,Houston

7,Kay,Salem

;

data have3;

infile cards dsd;

input ID$ name$ location$;

cards;

1,Mark,New York

3,Keith,Portland

5,Jim,Dallas

7,Kay,Salem

9,Josh,stpaul

;

data want;

merge have1(in=a)

      have2(in=b)

      have3(in=c);

by id name location;

if a and not b and not c then Flag = 'A Only';

if not a and b and c then Flag = 'Not A';

if b and not a and not c then Flag = 'B Only';

if a and not b and c then Flag = 'Not B';

if c and not a and not b then Flag = 'C Only';

if a and b and not c then Flag = 'Not C';

if a and b and c then Flag = 'Valid';

run;

Frequent Contributor
Posts: 137

Re: Data Analysis-Matches/Non Matches look up logic?

Good morning Sir, That's close and I appreciate the smart label idea. The output looks neat labelling the invalidated ones among the 3 validated ones. I was also trying the same too but you accomplished it. However, when we look back at the PARENT_DATASET, on first validation(Using some Business logic, the parent dataset has been validated based on ID, Name and Location) the following observations would be in their respective 3 invalid datasets as did the valid ones were written to the 3 valid datasets respectively that you used for merge.

*ID_Invalidated_Table;

7    Kay    Salem

8    Andrew    Chicago

9    Josh    stpaul

10    Raymond    Paris

*Name_Invalidated_Table;

1    Mark     New york

2    Stephen    London

3    Keith    Portland

8    Andrew    Chicago

9    Josh    stpaul

10    Raymond    Paris

*Location_Invalidated_Table;

2    Stephen    London

4    Lee    Austin

6    Jack    Houston

8    Andrew    Chicago

10    Raymond    Paris

So, how do we  logically combine the above 3  the  invalid ones that we got from the smart datastep merge you did to create one exception report with the right label?

Thank you,

Charlotte

Valued Guide
Posts: 858

Re: Data Analysis-Matches/Non Matches look up logic?

Are you saying you want the three datasets derived from the first? 

Super User
Posts: 17,829

Re: Data Analysis-Matches/Non Matches look up logic?

Do you need to bring across the name/location from each table as well?

If not I would add in some flags for validated and then update using ID.

data want;

set parent;

valid_ID=0;

Valid_name=0;

Valid_location=0;

run;

proc sql;

update want

set valid_id=1

where id in (select id from id_valid);

quit;

And repeat for each validation...

Solution
‎06-24-2015 10:00 AM
Super User
Posts: 9,681

Re: Data Analysis-Matches/Non Matches look up logic?

Charlotte,

How about :

Code: Program

data have;
input ID   Name   $ Location & $20.;
cards;
1 Mark New york
2 Stephen London
3 Keith Portland
4 Lee Austin
5 Jim Dallas
6 Jack Houston
7 Kay Salem
8 Andrew Chicago
9 Josh stpaul
10 Raymond Paris
;
data ID_Validated_table;  
input ID   Name   $ Location & $20.;
cards;
1 Mark New York
2 Stephen London
3 Keith Portland
4 Lee Austin
5 Jim Dallas
6 Jack Houston
;
data Name_Validated_Table;
input ID   Name   $ Location & $20.;
cards;
4 Lee Austin
5 Jim Dallas
6 Jack Houston
7 Kay Salem
;
data Location_Validated_Table;
input ID   Name   $ Location & $20.;
cards;
1 Mark New york
3 Keith Portland
5 Jim Dallas
7 Kay Salem
9 Josh stpaul
;
run;
data want;
if _n_ eq 1 then do;
  if 0 then set ID_Validated_table(keep=ID ) ;
  declare hash h_id(dataset:'ID_Validated_table(keep=ID )');
  h_id.definekey('ID');
  h_id.definedone();
 
  if 0 then set Name_Validated_Table(keep=Name ) ;
  declare hash h_name(dataset:'Name_Validated_Table(keep=Name )');
  h_name.definekey('Name');
  h_name.definedone();

  if 0 then set Location_Validated_Table(keep=Location) ;
  declare hash h_location(dataset:'Location_Validated_Table(keep=Location)');
  h_location.definekey('Location');
  h_location.definedone();
end;
set have;
length reason_for_invalid $ 40;
if h_id.check() ne 0 then reason_for_invalid=catx(' ',reason_for_invalid,'ID');
if h_name.check() ne 0 then reason_for_invalid=catx(' ',reason_for_invalid,'Name');
if h_location.check() ne 0 then reason_for_invalid=catx(' ',reason_for_invalid,'Location');
run;

Xia Keshan

Frequent Contributor
Posts: 137

Re: Data Analysis-Matches/Non Matches look up logic?

Hi   , Thank you yet again as always for helping me out. I was gonna write to thank you personally, however the reason for the late acknowledgement is that I think I made a mistake in understanding the requirement correctly. So, I am still trying to understand the requirement properly  and might reach out to you again for help if needed. Take Care!-

Charlotte

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 374 views
  • 0 likes
  • 4 in conversation