raqthesolid
Quartz | Level 8

Hello folks,
I have a dataset with Firm, Year, and ROA variables.
My ROA variable ranges from -50% to +50%.
I want to group ROA into ranges and report the percentage of firms in each, like this:

 

Range           Percentage
less than 0     percentage of firms in this category
0 to 1%         percentage of firms in this category
2 to 3%         percentage of firms in this category
4 to 5%         percentage of firms in this category
......50%       percentage of firms in this category

 

Data sample:

Firm                    Year     ROA

PK0000201019 2016 11.70255
PK0000301017 2016 9.428861
PK0000401015 2016 -5.80082
PK0000601010 2016 17.6943
PK0001101010 2016 -8.47214
PK0001301016 2016 20.18228
PK0001901013 2016 11.68819
PK0002001011 2016 13.31986
PK0002101019 2016 2.865007
PK0002201017 2016 9.238791
PK0002301015 2016 19.7908
PK0002501010 2016 1.970355
PK0002701016 2016 64.1466
PK0002901012 2016 12.86187
PK0003001010 2016 -3.51228
PK0003101018 2016 19.38653
PK0003201016 2016 3.417498
PK0003301014 2016 22.99802
PK0003401012 2016 -10.3915
PK0003801013 2016 15.53839
PK0003901011 2016 4.804186
PK0004101017 2016 7.54267
PK0004301013 2016 22.41655
PK0004401011 2016 9.554636
PK0004701014 2016 8.438554
PK0004801012 2016 8.95593
PK0004901010 2016 7.656038
PK0005201014 2016 50.03889
PK0005401010 2016 -14.8514
PK0005501017 2016 8.194166
PK0005701013 2016 25.27504
PK0006001017 2016 3.329774
PK0006101015 2016 5.472684
PK0006801010 2016 10.02528
PK0006901018 2016 9.398788
PK0007101014 2016 29.06179
PK0008101013 2016 5.884868
PK0008201011 2016 18.7473
PK0008401017 2016 14.02341
PK0008501014 2016 7.427927
PK0009201010 2016 21.76193
PK0009401016 2016 -7.2675
PK0009501013 2016 -0.98955
PK0009801017 2016 30.41894
PK0010001011 2016 5.148667
PK0010301015 2016 -2.02785
PK0010501010 2016 14.67657
PK0010901012 2016 12.50834
PK0011001010 2016 6.203303
PK0011101018 2016 16.16196
PK0011201016 2016 22.07683
PK0011301014 2016 33.96201
PK0011501019 2016 -6.85826
PK0011701015 2016 34.67594
PK0012101017 2016 12.91066
PK0012401011 2016 10.27558
PK0012501018 2016 14.37119
PK0013401010 2016 4.465276
PK0014201013 2016 9.261005
PK0014501016 2016 9.95184

 

Thanks for any help.

 

3 REPLIES
SylvainTremblay
SAS Employee

Hello,

 

A simple strategy here would be to define a user-defined format in SAS (with PROC FORMAT) and apply it to ROA to create your ranges.

 

Here is a reference:

Building and Using User Defined Formats

https://support.sas.com/resources/papers/proceedings/proceedings/sugi29/236-29.pdf
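
As a rough sketch of the user-defined format approach (not taken from the paper above), something like the following could work. The data set name HAVE and the format name ROABIN are assumptions for illustration, and the range boundaries are only one possible reading of the question; extend or adjust them as needed.

```sas
/* Sketch only: HAVE is an assumed data set name, ROABIN a hypothetical format name. */
proc format;
   value roabin
      low -< 0    = 'Less than 0'
      0   -< 1    = '0 to <1%'
      1   -< 2    = '1 to <2%'
      2   -< 3    = '2 to <3%'
      3   -< 4    = '3 to <4%'
      4   -< 5    = '4 to <5%'
      5   - high  = '5% and above';   /* add further 1-point ranges up to 50% as needed */
run;

/* PROC FREQ groups observations by the formatted value of ROA */
proc freq data=have;
   tables roa / nocum;
   format roa roabin.;
run;
```

The Percent column of the PROC FREQ output then gives the share of firms in each range.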

 

Sylvain

Reeza
Super User
What happened to 1 to 2%?

I would use FLOOR() or CEIL() to round the variable down or up to an integer and group on that, since your ranges are essentially one unit wide. If your ranges span more than one unit, you could either use MOD() to account for that or create custom formats.

Or just apply a format to it.

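/* HAVE is a placeholder for your data set name */
proc freq data=have;
   tables roa;
   format roa 8.;   /* the w.d format rounds ROA to the nearest integer for grouping */
run;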
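
A minimal sketch of the FLOOR() idea mentioned above, assuming the input data set is called HAVE. The names WANT, WIDTH, and ROA_BIN are hypothetical, and pooling all negative values into one bin is an assumption based on the "less than 0" row in the question.

```sas
/* Sketch only: HAVE, WANT, WIDTH and ROA_BIN are assumed/hypothetical names. */
data want;
   set have;
   width = 1;                                   /* bin width in percentage points */
   if missing(roa) then roa_bin = .;            /* keep missing ROA as missing */
   else if roa < 0 then roa_bin = -1;           /* pool all negative ROA into one bin */
   else roa_bin = floor(roa / width) * width;   /* lower bound of the bin, e.g. 2.87 -> 2 */
run;

/* Percentage of firms in each bin */
proc freq data=want;
   tables roa_bin / nocum;
run;
```

Setting WIDTH to 2 would give bins 0 to <2, 2 to <4, and so on, without needing a separate format.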
StatDave
SAS Super FREQ

Any time that you want to assign observations into quantile groups of any consistent size, the easiest way to do that is with the GROUPS= option in PROC RANK. By specifying GROUPS=x, you create a variable with values 0, 1, 2, ... , x-1 indicating which x-tile each observation is in. So, GROUPS=4 creates a variable indicating which quartile each observation is in. In your case, if you want an indicator of each percentile, specify GROUPS=100. For example, these statements create data set MYRANKS, which is a copy of your input data set and includes the variable ROApctl with values 0, 1, 2, 3, ... , 99 indicating the percentile range for each observation.

 

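/* HAVE is a placeholder; without DATA=, PROC RANK uses the most recently created data set */
proc rank data=have groups=100 out=myranks;
   var ROA;
   ranks ROApctl;
run;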
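
To see which ROA values fall into each group, one option (a sketch, assuming the MYRANKS data set and ROApctl variable created above) is to summarize ROA by group:

```sas
/* Sketch only: lists the count and the smallest and largest ROA in each rank group */
proc means data=myranks n min max;
   class ROApctl;
   var ROA;
run;
```

With a small sample such as the one posted, many of the 100 groups will be empty, so a smaller GROUPS= value may be easier to read.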
