Hello
I'm a super novice SAS SQL coder. I have a data set (call it test) that has that looks something like this:
ID Name
3 John Smith
3 Andrew Smith
1 John Doe
2 Bozo the Clown
2 Felix the Cat
What I want to do is is keep just one record for ID #2 and ID #3. I don't care which record, Name could be sorted alphabetically and I could keep the first or last so I get back this:
ID Name
1 John Doe
2 Felix the Cat
3 John Smith
I can do this in Access, but I have millions of records and I think SAS is a much better tool to do this.
Thanks in advance.
data have; input ID Name $30.; cards; 3 John Smith 3 Andrew Smith 1 John Doe 2 Bozo the Clown 2 Felix the Cat ; proc sql; create table want as select * from have group by id having name=max(name); quit;
Just sort and use BY group processing in a data step.
data have;
input ID Name $30.;
cards;
3 John Smith
3 Andrew Smith
1 John Doe
2 Bozo the Clown
2 Felix the Cat
;
proc sort data=have out=want;
by id name;
run;
data want;
set want;
by id;
if last.id;
run;
Thank you for the super quick response Tom!
data have; input ID Name $30.; cards; 3 John Smith 3 Andrew Smith 1 John Doe 2 Bozo the Clown 2 Felix the Cat ; proc sql; create table want as select * from have group by id having name=max(name); quit;
If there multiple observations with the same value of name for an ID you could end up with multiple observations for the same value of ID.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.