deleted_user
Not applicable
Hi,

I'd like to write some code that selects the 5 biggest firms in a given category. In my mind, it should go something like:

proc sort data=source;
by category size;
run;

proc sql;
create table newtable as select
avg(var1) as var1, avg(var2) as var2, sum(var3) as var3
from source
where (size is in the top 5)
group by category;
quit;

However, I'm not sure how to test whether an observation is one of the 5 biggest in its category. Any ideas?

Thanks!
2 REPLIES
statsplank
Calcite | Level 5
Hi tbalsky,

Please look at the following example. I explained the steps in the comments.

/* original data set */
data firm; input category size @@;
datalines;
1 93458934 1 8467489 1 8679 2 56757 3 57850
1 456794 2 697058 2 59876 3 687957 3 698879
3 598679 1 8756356 1 356396 1 65365 2 56385
2 4686 2 469674967 2 49679674 3 4686 3 47696
3 76946794 3 64868
;

/* sort by category and then by size within each category in descending order */
proc sort data=firm;
by category DESCENDING size;
run;

/* create rank variable within each category */
data firm1; set firm;
by category;
rank+1;
if first.category then rank=1;
run;

/* data set that contains the 5 biggest firms within each category */
data firm2; set firm1(where=(rank<6));
run;
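
To get from firm2 to the per-category statistics asked for in the original question, one possible follow-up step is a PROC SQL query with GROUP BY. This is only a sketch under assumptions: the example data has no var1, var2, or var3, so it summarizes size instead, and the names top5_summary, n_firms, mean_size, and total_size are made up for illustration.

```sas
/* Summarize the five largest firms per category.
   Assumes firm2 was created by the steps above.
   Output table and column names are illustrative only. */
proc sql;
  create table top5_summary as
  select category,
         count(*)  as n_firms,     /* at most 5 rows per category */
         avg(size) as mean_size,   /* stand-in for avg(var1), avg(var2) */
         sum(size) as total_size   /* stand-in for sum(var3) */
  from firm2
  group by category;
quit;
```

With real data, replace size in the avg() and sum() calls with your own variables. Note that the running counter in firm1 keeps exactly 5 rows per category, so if two firms tie on size at the cutoff, which one is kept depends on the sort order.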
deleted_user
Not applicable
Thank you so much! This worked perfectly.
