BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Abin
Calcite | Level 5

Hi,

 

I would like to find the exact mismatch and missing values from the below Tables A and B by using Proc SQL procedure, Can anyone help me please?

 

Table A

CodeTestValue
1001ABCDGood
1001ABCDBad
1001ABCDAbove average
1001ABCDAverage

 

Table B

CodeTestValue1
1001ABCDBad
1001ABCDGood
1001ABCD 
1001ABCDBelow average

 

Expected output:

 

CodeTestValueValue1
1001ABCDAbove averageBelow average
1001ABCDAverage 

 

Thanks in advance!

 

Thanks,

Abin

1 ACCEPTED SOLUTION

Accepted Solutions
hbi
Quartz | Level 8 hbi
Quartz | Level 8

Here's the output for each of the four PROC SQL statements:

 

Table_A_Not_B

Obs

Code

Test

Value

seq_num

1

1001

ABCD

Above_average

1

2

1001

ABCD

Average

2

 

Table_B_Not_A

Obs

Code

Test

Value1

seq_num

1

1001

ABCD

Below_average

1

 

Want_Side_by_Side

Obs

Code

Test

Value

Value1

seq_num

1

1001

ABCD

Above_average

Below_average

1

2

1001

ABCD

Average

 

2

 

Want_Stacked

Obs

Code

Test

Value

Row_Source

1

1001

ABCD

Above_average

Table_A_Not_B

2

1001

ABCD

Average

Table_A_Not_B

3

1001

ABCD

Below_average

Table_B_Not_A

 

View solution in original post

6 REPLIES 6
ballardw
Super User

How are we supposed to know that "Below Average" is the mismatch for "Above average" and not Average?

That this is NOT the desired result?

 

Code Test Value Value1
1001 ABCD Above average  
1001 ABCD Average  Below average

 

You might want to provide a bit more data and possbily with more Code and Test values as currently we don't know if there are any and solutions if ALL of Code = 1001 and All of Test=ABCD may not work if there are others involved.

Abin
Calcite | Level 5

Thank you Ballardw!

 

Yes, its also possible!

 

But i don't want to see somehitng like

 

CodeTestValueValue1
1001ABCDgoodBad
1001ABCDBadgood
PGStats
Opal | Level 21

I suggest to list the mismatches in the following way:

 


title "Unmatched values";
proc sql;
select "A" as from, a.Code, a.Test, a.Value
from a left join b 
    on a.code=b.code and a.test=b.test and a.value=b.value
where b.value is missing
union all
select "B" as from, b.Code, b.Test, b.Value
from a right join b 
    on a.code=b.code and a.test=b.test and a.value=b.value
where a.value is missing
order by code, test, from
;
quit;
PG
hbi
Quartz | Level 8 hbi
Quartz | Level 8

This could be done in fewer steps, but I used multiple PROC SQL statements to show what is done at each step to arrive at the final results.

 

[Edited: you may want to change "EXCEPT ALL" to "EXCEPT". If your data contains multiple rows with identical values (e.g. three observations in Table_A that all have "1001 / ABCD / Good"), "EXCEPT" should give you better results.]

 

DATA Table_A;
  LENGTH Code 8 Test $10 Value $20;
  INPUT Code Test $ Value $;
  DATALINES;
  1001 ABCD Good
  1001 ABCD Bad
  1001 ABCD Above_average
  1001 ABCD Average
;
RUN;
 
DATA Table_B;
  LENGTH Code 8 Test $10 Value1 $20;
  INPUT Code Test $ Value1 $;
  DATALINES;
  1001 ABCD Bad
  1001 ABCD Good
  1001 ABCD .
  1001 ABCD Below_average
;
RUN;


PROC SQL;
  CREATE TABLE Table_A_Not_B AS 
  SELECT Qry.*
       , monotonic() AS seq_num
  FROM (SELECT * FROM Table_A(WHERE=(Value IS NOT NULL))
  /* Change "EXCEPT ALL" to "EXCEPT" */
  EXCEPT
  SELECT * FROM Table_B(WHERE=(Value1 IS NOT NULL))) AS Qry;
QUIT;


PROC SQL;
  CREATE TABLE Table_B_Not_A AS 
  SELECT Qry.*
       , monotonic() AS seq_num
  FROM (SELECT * FROM Table_B(WHERE=(Value1 IS NOT NULL))
  /* Change "EXCEPT ALL" to "EXCEPT" */
  EXCEPT
  SELECT * FROM Table_A(WHERE=(Value IS NOT NULL))) AS Qry;
QUIT;


/* this would give you the results exactly the way you want (side-by-side) */
PROC SQL;
  CREATE TABLE Want_Side_by_Side AS 
  SELECT COALESCE(T1.Code, T2.Code) AS Code
       , COALESCE(T1.Test, T2.Test) AS Test
       , T1.Value
       , T2.Value1
       , COALESCE(T1.seq_num, T2.seq_num) AS seq_num
  FROM Table_A_Not_B AS T1
  FULL OUTER JOIN Table_B_Not_A AS T2
    ON  (T1.Code        = T2.Code
         AND T1.Test    = T2.Test
         AND T1.seq_num = T2.seq_num);
QUIT;


/* alternate version: this would give you the results vertically (stacked); 
   this approach would be more traditional in SQL because it avoids assigning 
   and match/mismatch preference (Ballardw had pointed this out above) */
PROC SQL;
  CREATE TABLE Want_Stacked(drop=seq_num) AS 
  SELECT T1.*, 'Table_A_Not_B' AS Row_Source FROM Table_A_Not_B AS T1
  UNION 
  SELECT T2.*, 'Table_B_Not_A' AS Row_Source FROM Table_B_Not_A AS T2;
QUIT;

 

Hope this helps.

 

-- hbi

 

hbi
Quartz | Level 8 hbi
Quartz | Level 8

Here's the output for each of the four PROC SQL statements:

 

Table_A_Not_B

Obs

Code

Test

Value

seq_num

1

1001

ABCD

Above_average

1

2

1001

ABCD

Average

2

 

Table_B_Not_A

Obs

Code

Test

Value1

seq_num

1

1001

ABCD

Below_average

1

 

Want_Side_by_Side

Obs

Code

Test

Value

Value1

seq_num

1

1001

ABCD

Above_average

Below_average

1

2

1001

ABCD

Average

 

2

 

Want_Stacked

Obs

Code

Test

Value

Row_Source

1

1001

ABCD

Above_average

Table_A_Not_B

2

1001

ABCD

Average

Table_A_Not_B

3

1001

ABCD

Below_average

Table_B_Not_A

 

Steelers_In_DC
Barite | Level 11

proc sql;
create table want as
select distinct a.code,a.test,a.value,b.value1
from table_a a left join
     table_b b on
a.code = a.code and
b.test = b.test
where value not in (select b.value1 from table_b b) and
      value1 not in (select a.value from table_a a);

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2658 views
  • 5 likes
  • 5 in conversation