BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
learn2
Fluorite | Level 6

Hi, this is a question that has evolved out of curiosity…just learning SAS! If you see the table below there are duplicates and assuming there is a hierarchy, I would want to retain the most severe case on the duplicate injury date in the order of Home(least severe)<Clinic<Hospital(more severe). How would I go about retaining only the most severe case if the injury date is the same?

 

ID

Injury date

Location of injury

1

Jan02,2007

Home

1

Jan02, 2007

Clinic

2

Feb01,2000

Home

2

Feb01,2000

Hospital

2

Feb01,2000

Home

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

There are many ways to do this. 

 

Here is an easy to understand approach

 

data have;
input ID $ Injury_date:$20. Locationofinjury $;
infile datalines dlm=' ';
datalines;
1 Jan02,2007 Home
1 Jan02,2007 Clinic
2 Feb01,2000 Home
2 Feb01,2000 Hospital
2 Feb01,2000 Home
;

data help;
   set have;
   if Locationofinjury="Home" then severity=1;
   else if Locationofinjury="Clinic" then severity=2;
   else severity=3;
run;

proc sort data=help;
   by ID severity;
run;

data want(drop=severity);
   set help;
   by ID severity;
   if last.ID;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

There are many ways to do this. 

 

Here is an easy to understand approach

 

data have;
input ID $ Injury_date:$20. Locationofinjury $;
infile datalines dlm=' ';
datalines;
1 Jan02,2007 Home
1 Jan02,2007 Clinic
2 Feb01,2000 Home
2 Feb01,2000 Hospital
2 Feb01,2000 Home
;

data help;
   set have;
   if Locationofinjury="Home" then severity=1;
   else if Locationofinjury="Clinic" then severity=2;
   else severity=3;
run;

proc sort data=help;
   by ID severity;
run;

data want(drop=severity);
   set help;
   by ID severity;
   if last.ID;
run;
learn2
Fluorite | Level 6

And what about if there are other unique (not duplicate) injury dates in the table too? Such as:

ID

Injury date

Location of injury

1

Jan02,2007

Home

1

Jan02, 2007

Clinic

2

Feb01,2000

Home

2

Feb06,2000

Hospital

2

Feb01,2000

Clinic

PeterClemmensen
Tourmaline | Level 20

Sort by injury_date as well and use last.injury date in the last data step like this

 

data have;
input ID $ Injury_date:$20. Locationofinjury $;
infile datalines dlm=' ';
datalines;
1 Jan02,2007 Home
1 Jan02,2007 Clinic
2 Feb01,2000 Home
2 Feb06,2000 Hospital
2 Feb01,2000 Home
;

data help;
   set have;
   if Locationofinjury="Home" then severity=1;
   else if Locationofinjury="Clinic" then severity=2;
   else severity=3;
run;

proc sort data=help;
   by ID Injury_date severity;
run;

data want(drop=severity);
   set help;
   by ID Injury_date severity;
   if last.Injury_date;
run;
Kurt_Bremser
Super User

Create a new numeric variable that us 1 for Home, 2 for clinic and 3 for Hospital.

Sort by id, date and this new variable.

Then run a data step with

by id date;
if last.date;
novinosrin
Tourmaline | Level 20

Hi @learn2   This is a situation where you typically need or better off having look up table in your DB/DWH design right in the schema. The look up table should be flexible enough to update periodically. Of course programatically fixing can be done but may not be a robust practice. 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1679 views
  • 2 likes
  • 4 in conversation