DATA Step, Macro, Functions and more

How to compare values in two columns regardless of the row orders?

Accepted Solution Solved
Reply
Contributor
Posts: 72
Accepted Solution

How to compare values in two columns regardless of the row orders?

[ Edited ]

Hi SAS users,

 

I have two columns with the same type character values. I just want to check if there is any duplicate between thease two columns (when I say duplicate, I don't mean duplicate at each row; I mean any repeated value in both columns regardless of the order.) Could you please help with it?

 

Thank you


Accepted Solutions
Solution
‎02-23-2016 09:06 AM
Trusted Advisor
Posts: 1,117

Re: How to compare values in two columns regardless of the row orders?

Posted in reply to almmotamedi

Hi @almmotamedi,

 

Thanks for the clarification. If you want to create a new dataset (call it WANT) with one variable, New_column1, from the existing dataset (call it HAVE) with the two variables, you could use the EXCEPT operator in PROC SQL:

proc sql;
create table want as
select column1 as New_column1 from have
except
select column2 from have;
quit;

This assumes that, say, "B" should not occur in New_column1 even if it occurred twice in column1, but only once in column2. If, in the latter case, you would like to take into account the "multiplicities", i.e. eliminate only one of the two occurrences of "B", just replace except by except all in the code above.

View solution in original post


All Replies
Super User
Posts: 19,768

Re: How to compare values in two columns regardless of the row orders?

Posted in reply to almmotamedi

What do you want the output to be?

Contributor
Posts: 72

Re: How to compare values in two columns regardless of the row orders?

[ Edited ]

Its simply like this:

 

column1    column2

    A                 B

    B                 C

    C

    D

 

 

I would like to create the New_column1 as below:

 

New_column1

        A

        D

  

(Because, A and D are not repeated/duplicate in the 2nd column.)

Solution
‎02-23-2016 09:06 AM
Trusted Advisor
Posts: 1,117

Re: How to compare values in two columns regardless of the row orders?

Posted in reply to almmotamedi

Hi @almmotamedi,

 

Thanks for the clarification. If you want to create a new dataset (call it WANT) with one variable, New_column1, from the existing dataset (call it HAVE) with the two variables, you could use the EXCEPT operator in PROC SQL:

proc sql;
create table want as
select column1 as New_column1 from have
except
select column2 from have;
quit;

This assumes that, say, "B" should not occur in New_column1 even if it occurred twice in column1, but only once in column2. If, in the latter case, you would like to take into account the "multiplicities", i.e. eliminate only one of the two occurrences of "B", just replace except by except all in the code above.

Contributor
Posts: 72

Re: How to compare values in two columns regardless of the row orders?

Posted in reply to FreelanceReinhard
Thank you so much
Trusted Advisor
Posts: 1,117

Re: How to compare values in two columns regardless of the row orders?

[ Edited ]
Posted in reply to almmotamedi

Hi @almmotamedi,

 

Do you mean something like this?

data have;
length var1 var2 $10;
input var1 var2;
cards;
blue   red
green  yellow
red    brown
black  green
;

proc sql;
create table dups as
select a.var1
from have a, have b
where a.var1=b.var2;
quit;

proc print noobs;
run;

Result:

var1

green
red

 

Super Contributor
Posts: 490

Re: How to compare values in two columns regardless of the row orders?

Posted in reply to FreelanceReinhard

You may need to 

select distinct a.var1

if the value could be repeated in each column more than one time. Or you will end up by long list.

Super User
Posts: 10,018

Re: How to compare values in two columns regardless of the row orders?

Posted in reply to almmotamedi

It looks like you need some QUERY tool. MERGE, Hash Table, SQL , IML ,Array ..... all can do that.

Here is the IML code :

 

 

data have;
length var1 var2 $10;
input var1 var2;
cards;
blue   red
green  yellow
red    brown
black  green
;
run;
proc iml;
use have;
read all var {var1 var2};
close have;
x=var1[loc(element(var1,var2))];
print x;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 2663 views
  • 1 like
  • 5 in conversation