Hello Folks,
I have a dataset with Firm, Year, and ROA variables.
My ROA variable ranges from -50% to +50%.
I want to group ROA into ranges and report the percentage of firms in each range, like this:
Range | Percentage |
less than 0 | percentage of firms in this category |
0 to 1% | percentage of firms in this category |
2 to 3% | percentage of firms in this category |
4 to 5% | percentage of firms in this category |
... up to 50% | percentage of firms in this category |
Data sample:
Firm | Year | ROA |
PK0000201019 | 2016 | 11.70255 |
PK0000301017 | 2016 | 9.428861 |
PK0000401015 | 2016 | -5.80082 |
PK0000601010 | 2016 | 17.6943 |
PK0001101010 | 2016 | -8.47214 |
PK0001301016 | 2016 | 20.18228 |
PK0001901013 | 2016 | 11.68819 |
PK0002001011 | 2016 | 13.31986 |
PK0002101019 | 2016 | 2.865007 |
PK0002201017 | 2016 | 9.238791 |
PK0002301015 | 2016 | 19.7908 |
PK0002501010 | 2016 | 1.970355 |
PK0002701016 | 2016 | 64.1466 |
PK0002901012 | 2016 | 12.86187 |
PK0003001010 | 2016 | -3.51228 |
PK0003101018 | 2016 | 19.38653 |
PK0003201016 | 2016 | 3.417498 |
PK0003301014 | 2016 | 22.99802 |
PK0003401012 | 2016 | -10.3915 |
PK0003801013 | 2016 | 15.53839 |
PK0003901011 | 2016 | 4.804186 |
PK0004101017 | 2016 | 7.54267 |
PK0004301013 | 2016 | 22.41655 |
PK0004401011 | 2016 | 9.554636 |
PK0004701014 | 2016 | 8.438554 |
PK0004801012 | 2016 | 8.95593 |
PK0004901010 | 2016 | 7.656038 |
PK0005201014 | 2016 | 50.03889 |
PK0005401010 | 2016 | -14.8514 |
PK0005501017 | 2016 | 8.194166 |
PK0005701013 | 2016 | 25.27504 |
PK0006001017 | 2016 | 3.329774 |
PK0006101015 | 2016 | 5.472684 |
PK0006801010 | 2016 | 10.02528 |
PK0006901018 | 2016 | 9.398788 |
PK0007101014 | 2016 | 29.06179 |
PK0008101013 | 2016 | 5.884868 |
PK0008201011 | 2016 | 18.7473 |
PK0008401017 | 2016 | 14.02341 |
PK0008501014 | 2016 | 7.427927 |
PK0009201010 | 2016 | 21.76193 |
PK0009401016 | 2016 | -7.2675 |
PK0009501013 | 2016 | -0.98955 |
PK0009801017 | 2016 | 30.41894 |
PK0010001011 | 2016 | 5.148667 |
PK0010301015 | 2016 | -2.02785 |
PK0010501010 | 2016 | 14.67657 |
PK0010901012 | 2016 | 12.50834 |
PK0011001010 | 2016 | 6.203303 |
PK0011101018 | 2016 | 16.16196 |
PK0011201016 | 2016 | 22.07683 |
PK0011301014 | 2016 | 33.96201 |
PK0011501019 | 2016 | -6.85826 |
PK0011701015 | 2016 | 34.67594 |
PK0012101017 | 2016 | 12.91066 |
PK0012401011 | 2016 | 10.27558 |
PK0012501018 | 2016 | 14.37119 |
PK0013401010 | 2016 | 4.465276 |
PK0014201013 | 2016 | 9.261005 |
PK0014501016 | 2016 | 9.95184 |
Thanks for any help.
Hello,
A simple strategy here would be to create a user-defined format in SAS (PROC FORMAT) that defines your ranges.
Here is a reference:
Building and Using User Defined Formats
https://support.sas.com/resources/papers/proceedings/proceedings/sugi29/236-29.pdf
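To make the format idea concrete, here is a minimal sketch rather than a definitive solution. The data set name HAVE and the format name ROAFMT are hypothetical, and the ranges are assumed to be contiguous 2-point bins (shown only up to 10, with everything above lumped together), so extend the pattern up to 50 as needed. PROC FREQ groups observations by their formatted values, which gives the percentage of firms in each range.

```sas
/* Hypothetical input data set HAVE, using a few rows from the sample above. */
data have;
  input Firm :$12. Year ROA;
  datalines;
PK0000201019 2016 11.70255
PK0000401015 2016 -5.80082
PK0002101019 2016 2.865007
PK0002501010 2016 1.970355
PK0005201014 2016 50.03889
;
run;

/* Hypothetical format ROAFMT: "-<" excludes the upper bound of each range. */
proc format;
  value roafmt
    low -<  0  = 'less than 0'
    0   -<  2  = '0 to <2%'
    2   -<  4  = '2 to <4%'
    4   -<  6  = '4 to <6%'
    6   -<  8  = '6 to <8%'
    8   -< 10  = '8 to <10%'
    10  - high = '10% or more';
run;

/* Count and percentage of observations per formatted range. */
proc freq data=have;
  tables ROA / nocum;
  format ROA roafmt.;
run;
```

If the percentages are wanted per year, adding Year to the TABLES request (for example `tables Year*ROA`) would be one option.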
Sylvain
Any time you want to assign observations to quantile groups, the easiest way to do that is with the GROUPS= option in PROC RANK. Specifying GROUPS=x creates a variable with values 0, 1, 2, ..., x-1 indicating which x-tile each observation is in. So GROUPS=4 creates a variable indicating which quartile each observation is in. In your case, if you want an indicator of each percentile, specify GROUPS=100. For example, these statements create data set MYRANKS, which is a copy of your input data set plus the variable ROApctl, whose values 0, 1, 2, 3, ..., 99 indicate the percentile group of each observation. Note that the groups are formed from ranks, so each one holds roughly the same number of observations rather than spanning a fixed width of ROA values.
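proc rank groups=100 out=myranks;  /* no DATA= option, so the most recently created data set is used */
  var ROA;         /* variable to rank */
  ranks ROApctl;   /* new variable holding the group number, 0 to 99 */
run;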
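As a quick check on what each group contains, a sketch like the following lists the number of observations and the smallest and largest ROA value in every group. It assumes the MYRANKS data set and the ROApctl variable created by the PROC RANK step above.

```sas
/* Assumes MYRANKS (with ROApctl) was created by the PROC RANK step above. */
proc means data=myranks n min max;
  class ROApctl;   /* one row of statistics per rank group */
  var ROA;         /* shows the ROA values each group actually spans */
run;
```

With a small sample and GROUPS=100, many group numbers will be empty, so a smaller GROUPS= value may be easier to read.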