BookmarkSubscribeRSS Feed
Justin9
Obsidian | Level 7

Hi, I've got a dataset (called 'RC.DATA_CHECK') that has two columns: Quarter and Rate. Please can someone provide me with the required code that would only keep one version of the Quarter value, based on the highest value of Rate? Note that 2021Q1 has three records, two of which are the same, but I would just like to keep one of these rows for this quarter. I have included a screenshot below of what I would like my dataset to look like after the dataset has been de-duplicated.

 

Justin9_1-1645999282081.png

 

 

7 REPLIES 7
PaigeMiller
Diamond | Level 26

Find the maximum value of rate, by quarter:

 

proc summary data=have nway;
    class quarter;
    var rate;
    output out=want max=;
run;

 

--
Paige Miller
Justin9
Obsidian | Level 7

Thanks for your reply. I've just realised that I've got another dataset that has a similar problem to what I posted, only that it has about an extra 100 columns along with the two (quarter and rate). If I still wanted to do the same thing (keep one record of quarter based on the highest value of rate, how would the code change (i.e. I want to have all of the c.100 columns in my dataset, with the only change being that only one unique quarter (based on highest rate) is shown with all of the information from the other c.100 columns for this row)?

Justin9
Obsidian | Level 7

Hi, can someone please tell me how the code would change if I wanted the same criteria (keep only one quarter based on the highest value of rate), but there are 100 variables/columns in the dataset (rather than just the two)?

Tom
Super User Tom
Super User

Sort the data and use FIRST. processing.

proc sort data=have out=sorted;
  by quarter descending rate;
run;

data want;
  set sorted;
  by quarter;
  if first.quarter;
run;

If there are variables you can use to help decide which one of the ties to pick add those to the BY statement in the proc sort step.

 

mkeintz
PROC Star

Your data appear to be sorted by quarter.  If so, then: 

 

data want (drop=max_rate);
  do until (last.quarter);
    set have;
    by quarter;
    max_rate=max(max_rate,rate);
  end;
  do until (last.quarter);
    set have;
    by quarter;
    if rate=max_rate then output;
  end;
run;

As long as no quarter has multiple observations with the maximum rate, this will produce what you want.

 

It works by reading all obs for each quarter twice.  The first time to determine the maximum rate.  The second time to re-read and output the obs with that maximum.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Justin9
Obsidian | Level 7

Thanks for your post. My example had 2021Q1 with two observations with the maximum rate (see original post), so does that impact on the code that you've just posted?

mkeintz
PROC Star

@Justin9 wrote:

Thanks for your post. My example had 2021Q1 with two observations with the maximum rate (see original post), so does that impact on the code that you've just posted?


Then which record with the tied rates do you want?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2843 views
  • 0 likes
  • 4 in conversation