Hi,
Suppose I have the following table:
ID | Name |
---|---|
1 | Mike |
1 | Mike |
2 | George |
3 | Jack |
3 | Jack |
4 | Tan |
Is it possible to delete the duplicate rows for IDs 1 and 3 while keeping the rows that have no duplicates, such as those for IDs 2 and 4, as they are?
Thank you,
data want;
set have;
by id;
if first.id;
run;
Hi stat@sas,
I ran the following, including your code:
data dup;
input id name$;
datalines;
3 a
3 a
1 d
5 e
5 e
4 y
2 t
2 t
;
run;
data dup2;
set dup;
by id;
if first.id;
run;
But I get an error message:
ERROR 180-322: BY variables not properly sorted on dataset DUP
And the result that I get is:
Obs | id | name |
---|---|---|
1 | 3 | a |
It seems that the duplicate was removed for the first ID, but after that the code stopped running.
Thank you
You need to sort dataset dup by id before using BY-group processing:
proc sort data=dup;
by id;
run;
Then try this:
data dup2;
set dup;
by id;
if first.id;
run;
In this case, why not just:
proc sort data=dup nodupkey;
by id;
run;
Thanks Haikuo - Yes, this is a better solution.
Thank you Hai.kuo and stat@sas !!!
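One thing to keep in mind with the NODUPKEY approach: as written it overwrites dup in place, and it compares only the BY variables, so rows that share an id but differ in name would also be dropped. The sketch below is a cautious variant, not a definitive recipe. It writes the de-duplicated rows to a separate dataset with OUT= and saves the discarded rows with DUPOUT= so they can be checked. The dataset names dup_clean and dup_removed are made up for illustration.

```sas
/* Sketch only: dup_clean and dup_removed are hypothetical names. */
/* OUT= leaves the original dataset dup untouched.                */
/* DUPOUT= collects the rows that NODUPKEY discards.              */
proc sort data=dup out=dup_clean nodupkey dupout=dup_removed;
  by id;   /* keeps the first row found for each id */
run;

/* Inspect what was thrown away before trusting the result. */
proc print data=dup_removed;
run;
```

If only rows that are identical in every column should count as duplicates, listing all of the variables in the BY statement (here `by id name;`) is one way to do that.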