BookmarkSubscribeRSS Feed
AkshayjitSinghNarula
Fluorite | Level 6

Hi,

 

I've been struggling to figure out a way to extract the top 20-30 observations, group wise, from a SAS dataset I'm working  on. 

Basically, I have 5 Indian cities in my dataset (Variable = City) and Net sales against each customer name. I want to pull out the top 20 - 30 observations based on the largest net sales values for each city automatically and subsequently create a separate dataset of those observations. Can someone help me here please? 🙂

2 REPLIES 2
jklaverstijn
Rhodochrosite | Level 12

There is some sample code in SAS note 24778 that explains how to do it in datastep. Depending on how you'd like to deal with ties (equal scores) PROC RANK may also be helpful:

 

proc sort data=sashelp.cars out=cars;
        by make;
run;
proc rank data=sashelp.cars ou=carsrank;
        by make;
        var horsepower;
        ranks hp_r;
run;
proc sort data=carsrank;
        by make hp_r;
        where hp_r <= 5;
run;

Actually, Google on "sas top n by group" gives a myriad of alternatives, each with its own virtues.

 

Good luck,

- Jan.

Reeza
Super User

1. Sort dataset by group and variable of interest

2. Add a counter to count the values

3. Take top N as desired

 

*Assuming N = 5, take top 5 tallest students by sex; 

*1 - sort by variables;
proc sort data=sashelp.class out=class;
by sex descending height;
run;


data want;
set class;
by sex;

*Create counter for each group;
if first.sex then count=1;
else count+1;

*Extract based on N;
if count<5;

run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 3560 views
  • 2 likes
  • 3 in conversation