BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,

I'd like to write some code that selects the 5 biggest firms in a given category. In my mind, it should go something like:

proc sort data=source;
by category size;
run;

proc sql;
create table newtable as select
avg(var1) as var1, avg(var2) as var2, sum(var3) as var3
from source
where (size is in the top 5)
group by category;
quit;

However, I'm not sure what the code to determine if it's one of the biggest 5 observations is. Any ideas?

Thanks!
2 REPLIES 2
statsplank
Calcite | Level 5
Hi tbalsky,

Please look at the following example. I explained steps in the comments.

/* original data set */
data firm; input category size @@;
datalines;
1 93458934 1 8467489 1 8679 2 56757 3 57850
1 456794 2 697058 2 59876 3 687957 3 698879
3 598679 1 8756356 1 356396 1 65365 2 56385
2 4686 2 469674967 2 49679674 3 4686 3 47696
3 76946794 3 64868
;

/* sort by category and then by size within each category in descending order */
proc sort data=firm;
by category DESCENDING size;
run;

/* create rank variable within each category */
data firm1; set firm;
by category;
rank+1;
if first.category then rank=1;
run;

/* data set that contains 5 biggest firm within each category */
data firm2; set firm1(where=(rank<6));
run;
deleted_user
Not applicable
Thank you so much! This worked perfectly.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 1358 views
  • 0 likes
  • 2 in conversation