BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,

I'd like to write some code that selects the 5 biggest firms in a given category. In my mind, it should go something like:

proc sort data=source;
by category size;
run;

proc sql;
create table newtable as select
avg(var1) as var1, avg(var2) as var2, sum(var3) as var3
from source
where (size is in the top 5)
group by category;
quit;

However, I'm not sure what the code to determine if it's one of the biggest 5 observations is. Any ideas?

Thanks!
2 REPLIES 2
statsplank
Calcite | Level 5
Hi tbalsky,

Please look at the following example. I explained steps in the comments.

/* original data set */
data firm; input category size @@;
datalines;
1 93458934 1 8467489 1 8679 2 56757 3 57850
1 456794 2 697058 2 59876 3 687957 3 698879
3 598679 1 8756356 1 356396 1 65365 2 56385
2 4686 2 469674967 2 49679674 3 4686 3 47696
3 76946794 3 64868
;

/* sort by category and then by size within each category in descending order */
proc sort data=firm;
by category DESCENDING size;
run;

/* create rank variable within each category */
data firm1; set firm;
by category;
rank+1;
if first.category then rank=1;
run;

/* data set that contains 5 biggest firm within each category */
data firm2; set firm1(where=(rank<6));
run;
deleted_user
Not applicable
Thank you so much! This worked perfectly.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 1478 views
  • 0 likes
  • 2 in conversation