Hi SAS users,
I have two columns with the same type character values. I just want to check if there is any duplicate between thease two columns (when I say duplicate, I don't mean duplicate at each row; I mean any repeated value in both columns regardless of the order.) Could you please help with it?
Thank you
Hi @almmotamedi,
Thanks for the clarification. If you want to create a new dataset (call it WANT) with one variable, New_column1, from the existing dataset (call it HAVE) with the two variables, you could use the EXCEPT operator in PROC SQL:
proc sql;
create table want as
select column1 as New_column1 from have
except
select column2 from have;
quit;
This assumes that, say, "B" should not occur in New_column1 even if it occurred twice in column1, but only once in column2. If, in the latter case, you would like to take into account the "multiplicities", i.e. eliminate only one of the two occurrences of "B", just replace except by except all in the code above.
What do you want the output to be?
Its simply like this:
column1 column2
A B
B C
C
D
I would like to create the New_column1 as below:
New_column1
A
D
(Because, A and D are not repeated/duplicate in the 2nd column.)
Hi @almmotamedi,
Thanks for the clarification. If you want to create a new dataset (call it WANT) with one variable, New_column1, from the existing dataset (call it HAVE) with the two variables, you could use the EXCEPT operator in PROC SQL:
proc sql;
create table want as
select column1 as New_column1 from have
except
select column2 from have;
quit;
This assumes that, say, "B" should not occur in New_column1 even if it occurred twice in column1, but only once in column2. If, in the latter case, you would like to take into account the "multiplicities", i.e. eliminate only one of the two occurrences of "B", just replace except by except all in the code above.
Hi @almmotamedi,
Do you mean something like this?
data have;
length var1 var2 $10;
input var1 var2;
cards;
blue red
green yellow
red brown
black green
;
proc sql;
create table dups as
select a.var1
from have a, have b
where a.var1=b.var2;
quit;
proc print noobs;
run;
Result:
var1 green red
You may need to
select distinct a.var1
if the value could be repeated in each column more than one time. Or you will end up by long list.
It looks like you need some QUERY tool. MERGE, Hash Table, SQL , IML ,Array ..... all can do that.
Here is the IML code :
data have;
length var1 var2 $10;
input var1 var2;
cards;
blue red
green yellow
red brown
black green
;
run;
proc iml;
use have;
read all var {var1 var2};
close have;
x=var1[loc(element(var1,var2))];
print x;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.