Didi_b
Obsidian | Level 7

Hello,

I am trying to remove duplicate values by the variable ID, but I also need to keep the highest value of the variable kid_age for the observations that have duplicates. Is this possible? Could someone help me, please?

I need only one observation per ID, and for any ID that is listed more than once in my data, that observation should carry the highest value of kid_age.

 

This is the SAS program I used to remove the duplicates:

Proc sort data=Work.base
out=clean   
dupout=dups   
nodupkey;
by ID;
run;

Example data:

ID name age kid_age
XD546 Alex 18 1
GT786 Yvan 35 10
PE358 Sami 25 5
LK523 Yan 40 18
LK523 Yan 40 15
UY841 Doris 28 14
UY841 Doris 28 8
PQ153 Suzi 38 16
PQ153 Suzi 38 8
PQ153 Suzi 38 3

 

1 ACCEPTED SOLUTION

Reeza
Super User

Either sort twice or use SQL to control the aggregation more precisely. Which one fits depends on whether you can assume that Name and Age are constant within each ID.

 

See the process below, run on a slightly modified copy of your example data (note Suza versus Suzi).

SQL gives you a few more options and a little more control.

 

data example;
input ID $ name $ age kid_age;
cards;
XD546 Alex 18 1
GT786 Yvan 35 10
PE358 Sami 25 5
LK523 Yan 40 18
LK523 Yan 40 15
UY841 Doris 28 14
UY841 Doris 28 8
PQ153 Suzi 38 16
PQ153 Suza 38 8
PQ153 Suzi 38 3
;;;;
run;

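/* First sort: order the rows within each ID. Name and age come before */
/* kid_age here, so the Suza row sorts ahead of both Suzi rows. */
proc sort data=example;
by id name age descending kid_age ;
run;


/* Second sort: with the default EQUALS behavior, nodupkey keeps the */
/* first row of each ID in clean and writes the removed rows to dups. */
Proc sort data=example
out=clean   
dupout=dups   
nodupkey;
by ID;
run;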

proc print data=example;run;

proc print data=clean;run;

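/* max() is evaluated per column, so the name, age, and kid_age in one */
/* output row can come from different input rows of the same ID. */
proc sql;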
create table clean_want as
select id, max(name) as name, max(age) as age, max(kid_age) as kid_age
from example
group by id;
quit;

proc print data=clean_want;
run;
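
As one illustration of the extra control SQL offers, here is a minimal sketch that keeps the complete row holding the largest kid_age for each ID, rather than taking the maximum of each column separately. It assumes the example data set created above; the table name clean_row is hypothetical. It relies on PROC SQL remerging the summary statistic back onto the detail rows, which is the default behavior, and SAS writes a note about the remerge to the log.

```sas
/* Sketch only: keep the whole row with the largest kid_age per ID.  */
/* Assumes the EXAMPLE data set from above; CLEAN_ROW is a made-up   */
/* name used for illustration.                                       */
proc sql;
  create table clean_row as
  select id, name, age, kid_age
  from example
  group by id
  having kid_age = max(kid_age);   /* compares each row to its group maximum */
quit;

proc print data=clean_row;
run;
```

If two rows of the same ID tie on the largest kid_age, both rows are kept, so a further de-duplication step would still be needed in that case.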

 

 


2 REPLIES
Michelleazevedo
Fluorite | Level 6
Proc sort data=Work.base;
by ID descending kid_age;
run;

Proc sort data=Work.base
out=clean   
dupout=dups   
nodupkey;
by ID;
run;
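
An alternative to the second sort, shown here only as a sketch, is a DATA step that keeps the first observation of each BY group. It assumes the same Work.base input; the intermediate data set name sorted is hypothetical, and writing to it leaves Work.base untouched.

```sas
/* Sketch only: sort so the largest kid_age comes first within each ID. */
/* SORTED is a made-up intermediate name; CLEAN matches the name above. */
proc sort data=Work.base out=sorted;
  by ID descending kid_age;
run;

data clean;
  set sorted;
  by ID;          /* valid because SORTED is ordered by ID first */
  if first.ID;    /* keep only the first row of each ID group    */
run;
```

This version does not produce a dups data set; an OUTPUT statement directed to a second data set for the rows where first.ID is false could collect them if needed.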

 

If you want only the observations whose ID appears more than once:

Proc sort data=Work.base
out=dups   
nouniquekey;
by ID;
run;

With dupout, the dups data set holds only the observations that nodupkey removed.

 
