deleted_user
Not applicable
Hi,

I'd like to write some code that selects the 5 biggest firms in a given category. In my mind, it should go something like:

proc sort data=source;
by category size;
run;

proc sql;
create table newtable as select
avg(var1) as var1, avg(var2) as var2, sum(var3) as var3
from source
where (size is in the top 5)
group by category;
quit;

However, I'm not sure how to write the condition that checks whether an observation is one of the 5 biggest in its category. Any ideas?

Thanks!
2 REPLIES
statsplank
Calcite | Level 5
Hi tbalsky,

Please look at the following example. I have explained the steps in the comments.

/* original data set */
data firm;
    input category size @@;
datalines;
1 93458934 1 8467489 1 8679 2 56757 3 57850
1 456794 2 697058 2 59876 3 687957 3 698879
3 598679 1 8756356 1 356396 1 65365 2 56385
2 4686 2 469674967 2 49679674 3 4686 3 47696
3 76946794 3 64868
;

/* sort by category and then by size within each category in descending order */
proc sort data=firm;
by category DESCENDING size;
run;

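/* create a rank variable that restarts at 1 within each category */
data firm1;
    set firm;
    by category;
    if first.category then rank = 0;  /* reset the counter at the start of each category */
    rank + 1;                         /* sum statement: rank is retained across observations */
run;

/* data set that contains the 5 biggest firms within each category */
data firm2;
    set firm1(where=(rank <= 5));
run;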
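
If you also need the per-category summary from the original question, here is one possible sketch that uses PROC RANK instead of a counter and then aggregates in PROC SQL. The names firm_sorted, firm_ranked, size_rank and top5_summary are made up for illustration. The example data has no var1-var3, so the query summarizes size instead; you would substitute your own variables. Note that with TIES=LOW, firms tied at the fifth position all get the same rank, so a category can return more than 5 rows.

```sas
/* PROC RANK with a BY statement needs the input sorted by category */
proc sort data=firm out=firm_sorted;   /* firm_sorted: hypothetical name */
    by category;
run;

/* rank firms within each category, largest size = rank 1 */
proc rank data=firm_sorted out=firm_ranked descending ties=low;
    by category;
    var size;
    ranks size_rank;                   /* size_rank: hypothetical name */
run;

/* summarize the top-ranked firms per category;
   replace the size columns with avg(var1), avg(var2), sum(var3) as needed */
proc sql;
    create table top5_summary as
    select category,
           count(*)  as n_firms,
           avg(size) as avg_size,
           sum(size) as total_size
    from firm_ranked
    where size_rank <= 5
    group by category;
quit;
```

The counter approach above always returns exactly 5 rows per category (fewer if the category is smaller), which may be preferable if ties should not add extra firms.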
deleted_user
Not applicable
Thank you so much! This worked perfectly.