BookmarkSubscribeRSS Feed
leahcho
Obsidian | Level 7

Hi,

 

I have a dataset

 

ID   age     var1

1     30       1

1     30       2

1     32       1

1     33       3

How do I remove duplicate ID and age ? It doesn't matter if I keep the first or last of duplicate

So the I want my table to look like

 

ID age var1

1   30    1

1   32    1

1   33    3

 

Thanks

4 REPLIES 4
andreas_lds
Jade | Level 19
Look up the documentation of proc sort, especially the nodule option. Using Id and age as by-vars should solve the problem.
Shmuel
Garnet | Level 18

You can remove duplicats, either by

proc sort data=have out=want NODUPKEY;
  by id age;
run;

or by SQL:

proc sql;
    create table want as
    select * from have
    group by id, age
; quit;

If you compare the rwo methods, you'll find that -

the one keeps the 1st occurence while the other keeps the last occurence.

As much as I remeber, SQL keeps the 1st occurence.

 

Astounding
PROC Star

Given that your data set is already in sorted order by ID AGE, you could try:

 

data want;

set have;

by id age;

if first.age;

run;

 

It's worth spending time on the BY statement in a DATA step and how it creates FIRST.AGE (and a few more variables).  Those will be tools you use over and over again.

DavyJones
Obsidian | Level 7

proc sort data=have; 

by ID age var1;

run;

 

data want;

set have;

by ID age var1;

if first.var1;

run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 9017 views
  • 0 likes
  • 5 in conversation