Is there any way to delete duplicate records in a dataset?

N/A
Posts: 0

Hi,

Is there any way to delete exact duplicate records and write out only one record from each duplicated set?

Say, for example, my input file has a set of 5 exact duplicate records, and I want to delete the other 4 and write out just 1 record.

Thanks,
sasbase9
Super Contributor
Posts: 3,174

Re: Is there any way to delete duplicate records in a dataset

Explore PROC SORT and its DUPOUT= option.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 36

Re: Is there any way to delete duplicate records in a dataset

PROC SORT with the noduplicates option works.

proc sort data=dsname out=sorted noduplicates;
by var1 var2 ...;
run;

The noduplicates option removes records that are exactly the same in every variable.
The nodupkey option removes records where the BY variables are the same.

Hope this helps.
N/A
Posts: 0

Re: Is there any way to delete duplicate records in a dataset

proc sort data=x nodup dupout=dup;
by id;
run;

Now the duplicate observations are written to the DUP data set, and x keeps the master copy of each.
Respected Advisor
Posts: 3,777

Re: Is there any way to delete duplicate records in a dataset

Are you sure?

What do you expect the output of this program to be?

[pre]
data have;
input a b c;
cards;
1 2 3
1 1 3
1 2 3
;;;;
run;
proc sort data=have nodup out=nodups;
by a;
run;
[/pre]

From the online doc.
[pre]
If you specify this option, then PROC SORT compares all variable values for each
observation to those for the previous observation that was written to the output data set.
If an exact match is found, then the observation is not written to the output data set.
[/pre]

It goes on to say that using BY _ALL_ will result in the expected output...
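
To make the point above concrete, here is a minimal sketch that runs the same three rows through PROC SORT twice, once with BY a and once with BY _ALL_. The output data set names by_a and by_all are made up for illustration. It assumes the default EQUALS behavior, where rows with tied BY values keep their input order.

```sas
data have;
  input a b c;
  cards;
1 2 3
1 1 3
1 2 3
;
run;

/* BY a: every row ties on a, so the input order is kept and the two
   "1 2 3" rows are not adjacent. NODUPRECS only compares each row
   with the previous row written, so no row should be dropped. */
proc sort data=have out=by_a noduprecs;
  by a;
run;

/* BY _ALL_: sorting on every variable puts identical rows next to
   each other, so NODUPRECS can drop the repeated "1 2 3" row. */
proc sort data=have out=by_all noduprecs;
  by _all_;
run;
```

The log should report 3 observations for by_a and 2 for by_all, which is why sorting on the key alone is not enough to remove exact duplicates.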
N/A
Posts: 0

Re: Is there any way to delete duplicate records in a dataset

Hi,
You can use the following code, if you are not deleting on the basis of any key:

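proc sql noprint;
  /* UNION (without ALL) drops duplicate rows from the combined result,
     so the union of Temp1 with itself keeps one copy of each distinct row */
  create table Temp2 as
    select * from Temp1
    union
    select * from Temp1;
quit;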
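
For comparison, the same result can usually be written more directly with SELECT DISTINCT. This is a different technique from the UNION shown in the post, not the poster's method. The sample rows below are hypothetical and only stand in for Temp1.

```sas
/* Hypothetical sample data standing in for Temp1 */
data Temp1;
  input a b c;
  cards;
1 2 3
1 1 3
1 2 3
;
run;

proc sql noprint;
  /* DISTINCT keeps one copy of each row that matches on every column */
  create table Temp2 as
    select distinct * from Temp1;
quit;
```

With these three rows, Temp2 ends up with two rows, because the repeated "1 2 3" row is kept only once.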