Hi,
suppose I have the following table:
ID | Name |
---|---|
1 | Mike |
1 | Mike |
2 | George |
3 | Jack |
3 | Jack |
4 | Tan |
Is it possible to delete the duplicate rows for IDs 1 and 3 while keeping the rows that have no duplicates, such as those for IDs 2 and 4, as they are?
Thank you,
data want;
set have;
by id;
if first.id;
run;
Hi stat@sas,
I did the following including your code:
data dup;
input id name$;
datalines;
3 a
3 a
1 d
5 e
5 e
4 y
2 t
2 t
;
run;
data dup2;
set dup;
by id;
if first.id;
run;
But I get an error message:
ERROR 180-322: BY variables not properly sorted on dataset DUP
And the result that I get is:
Obs | id | name |
1 | 3 | a |
It seems that the duplicate was removed for the first ID, but the step stopped after that.
Thank you
You need to sort dataset dup by id before performing BY-group processing:
proc sort data=dup;
by id;
run;
Then try this:
data dup2;
set dup;
by id;
if first.id;
run;
In this case, why not just:
proc sort data=dup nodupkey;
by id;
run;
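One caveat, offered as a hedged sketch rather than a definitive recipe: NODUPKEY with BY id keeps only the first row for each id, even if the other variables differ, and without OUT= the sorted result overwrites dup. If only fully identical rows should be dropped, listing every variable in the BY statement (or using BY _ALL_) is one option. The data set names dup_nodup and dup_removed below are made up for illustration.

```sas
/* Sketch: leave DUP untouched and keep a copy of the rows that were dropped. */
/* dup_nodup and dup_removed are hypothetical names, not from the thread.     */
proc sort data=dup
          out=dup_nodup        /* de-duplicated result goes here        */
          dupout=dup_removed   /* rows removed as duplicates go here    */
          nodupkey;
  by id name;                  /* or: by _all_; to compare all variables */
run;
```

Checking dup_removed afterwards is a quick way to confirm that only true duplicates were deleted.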
Thanks Haikuo - Yes, this is a better solution.
Thank you Hai.kuo and stat@sas !!!