BookmarkSubscribeRSS Feed
sas_lak
Quartz | Level 8


Hi I am Sorry for small correction in data.

looking the duplicate data within the same dataset by the different fields

Ex:-

data have;

input name $5.+1 phone $5.;

cards;

AAAAA 12345

AAAAC 12345

AAAAA 09876

BBBBB 12345

CCCCC 99999

DDDDE 88888

DDDDD 88888

DDDDD 55555

EEEEE 44444

FFFFF  11111

;

RUN;

Am looking duplicate phone numbers by Name.(Same name having different Phone numbers)

Ex:-

Name Phone

AAAAA   12345

AAAAA   09876

DDDDE  88888

DDDDD  55555


I think we can use Soundex function to get exact duplicate values by proc sql;

plz any one help me with syntax.

Would Appreciate in Advance.

6 REPLIES 6
Chrishi
Calcite | Level 5

Hope this solves your issue,

proc sql;

create table want as select *,count(name) as count from have group by name having count > 1;

quit;

Sudeer

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

data have;
attrib Name Phone format=$200.;
name="AAA"; phone="12345"; output;
name="AAA"; phone="67890"; output;
name="BBB"; phone="fssfs"; output;
run;

proc sql;
     create table WANT as
     select     *
     from       HAVE
     group by NAME
     having    count(name)>1;
quit;

sas_lak
Quartz | Level 8

Thnank you for your valuable information

I Think we can use SOUNDEX Function to get exact Duplicates values.

proc sql;

create table dups as select distinct a.*, b.*

from have as a,have as b

where soundex(a.name) eq soundex(b.name) and a.phone ne b.phone

order by a.name

;

quit;

user24feb
Barite | Level 11

less elegant, but if you don´t care to sort and merge:


Proc Sort Data=have;
  By name;
Run;

Proc Freq Data=Have NoPrint;
  Tables name / Out=Have_F (Keep=Count name);
Run;

Data Want (Where=(Count ge 2));
  Merge Have (in=inH) Have_F (in=inF);
  By Name;
  If inH;
Run;

Chrishi
Calcite | Level 5

proc sql;

create table want as select * from have group by name having count > 1;

quit;

proc sort data=want out=dup nodupkey;

by name phone;

run;

Sudeer

Ksharp
Super User

Still not full sure what you try to get.

data have;
input name $5.+1 phone $5.;
cards;
AAAAA 12345
AAAAC 12345
AAAAA 09876
BBBBB 12345
CCCCC 99999
DDDDE 88888
DDDDD 88888
DDDDD 55555
EEEEE 44444
FFFFF  11111
;
RUN;
proc sql;
select * from have group by name having count(distinct phone) ge 2;
quit;

Xia Keshan

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2371 views
  • 6 likes
  • 5 in conversation