BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
almmotamedi
Obsidian | Level 7

Hi SAS users,

 

I have two columns with the same type character values. I just want to check if there is any duplicate between thease two columns (when I say duplicate, I don't mean duplicate at each row; I mean any repeated value in both columns regardless of the order.) Could you please help with it?

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @almmotamedi,

 

Thanks for the clarification. If you want to create a new dataset (call it WANT) with one variable, New_column1, from the existing dataset (call it HAVE) with the two variables, you could use the EXCEPT operator in PROC SQL:

proc sql;
create table want as
select column1 as New_column1 from have
except
select column2 from have;
quit;

This assumes that, say, "B" should not occur in New_column1 even if it occurred twice in column1, but only once in column2. If, in the latter case, you would like to take into account the "multiplicities", i.e. eliminate only one of the two occurrences of "B", just replace except by except all in the code above.

View solution in original post

7 REPLIES 7
Reeza
Super User

What do you want the output to be?

almmotamedi
Obsidian | Level 7

Its simply like this:

 

column1    column2

    A                 B

    B                 C

    C

    D

 

 

I would like to create the New_column1 as below:

 

New_column1

        A

        D

  

(Because, A and D are not repeated/duplicate in the 2nd column.)

FreelanceReinh
Jade | Level 19

Hi @almmotamedi,

 

Thanks for the clarification. If you want to create a new dataset (call it WANT) with one variable, New_column1, from the existing dataset (call it HAVE) with the two variables, you could use the EXCEPT operator in PROC SQL:

proc sql;
create table want as
select column1 as New_column1 from have
except
select column2 from have;
quit;

This assumes that, say, "B" should not occur in New_column1 even if it occurred twice in column1, but only once in column2. If, in the latter case, you would like to take into account the "multiplicities", i.e. eliminate only one of the two occurrences of "B", just replace except by except all in the code above.

FreelanceReinh
Jade | Level 19

Hi @almmotamedi,

 

Do you mean something like this?

data have;
length var1 var2 $10;
input var1 var2;
cards;
blue   red
green  yellow
red    brown
black  green
;

proc sql;
create table dups as
select a.var1
from have a, have b
where a.var1=b.var2;
quit;

proc print noobs;
run;

Result:

var1

green
red

 

mohamed_zaki
Barite | Level 11

You may need to 

select distinct a.var1

if the value could be repeated in each column more than one time. Or you will end up by long list.

Ksharp
Super User

It looks like you need some QUERY tool. MERGE, Hash Table, SQL , IML ,Array ..... all can do that.

Here is the IML code :

 

 

data have;
length var1 var2 $10;
input var1 var2;
cards;
blue   red
green  yellow
red    brown
black  green
;
run;
proc iml;
use have;
read all var {var1 var2};
close have;
x=var1[loc(element(var1,var2))];
print x;
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 15672 views
  • 1 like
  • 5 in conversation