Hi,
suppose I have the following table:
ID | Name |
---|---|
1 | Mike |
1 | Mike |
2 | George |
3 | Jack |
3 | Jack |
4 | Tan |
Is it possible to delete the duplicate rows for IDs 1 and 3 while keeping the rows that have no duplicates, such as those for IDs 2 and 4, as they are?
Thank you,
data want;
set have;
by id;
if first.id;
run;
Hi stat@sas,
I ran the following, including your code:
data dup;
input id name$;
datalines;
3 a
3 a
1 d
5 e
5 e
4 y
2 t
2 t
;
run;
data dup2;
set dup;
by id;
if first.id;
run;
But I get an error message:
ERROR 180-322: BY variables not properly sorted on dataset DUP
And the result that I get is:
Obs | id | name |
---|---|---|
1 | 3 | a |
It seems that the duplicate was deleted for the first ID, but after that the code stopped running.
Thank you
You need to sort the dataset dup by id before BY-group processing, because the BY statement in a DATA step expects the rows to already be in id order:
proc sort data=dup;
by id;
run;
Then try this:
data dup2;
set dup;
by id;
if first.id;
run;
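If the original row order matters and dup has not been sorted yet, one possible alternative is the NOTSORTED option on the BY statement. This is only a sketch, and it assumes that duplicate rows always sit next to each other, as they do in the dup data posted above; non-adjacent duplicates would survive. The output name dup2_unsorted is made up for illustration.

```sas
/* Sketch: keep the first row of each run of equal id values, without sorting. */
/* Assumes duplicate rows are adjacent, as in the dup dataset posted above.    */
data dup2_unsorted;   /* hypothetical output dataset name */
  set dup;
  by id notsorted;    /* NOTSORTED: BY groups are runs of consecutive equal values */
  if first.id;        /* keep only the first row of each run */
run;
```

On the unsorted dup data this should keep the rows in their original order (ids 3, 1, 5, 4, 2), one row per id.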
In this case, why not just:
proc sort data=dup nodupkey;
by id;
run;
Thanks Haikuo - Yes, this is a better solution.
Thank you Hai.kuo and stat@sas !!!
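For anyone landing here later: NODUPKEY as used above overwrites dup in place and treats any two rows with the same id as duplicates, even if name differs. A minimal sketch of two variations follows, assuming you would rather keep the input untouched; the dataset names dup_unique, dup_removed, and dup_unique_rows are hypothetical.

```sas
/* Sketch: de-duplicate into a new dataset and keep the discarded rows for review. */
proc sort data=dup
          out=dup_unique      /* hypothetical name: one row per id */
          dupout=dup_removed  /* hypothetical name: rows dropped as duplicates */
          nodupkey;
  by id;
run;

/* Variant: treat rows as duplicates only when every variable matches. */
proc sort data=dup out=dup_unique_rows nodupkey;
  by _all_;
run;
```

With the sample data both variants should give the same five rows, since every repeated id also repeats the name; they would differ only if the same id appeared with different names.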