BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
anweinbe
Quartz | Level 8

Good morning,

 

I have a file of individuals names and their change in ownership %'s between 1%-100%. There are approximately 500 records. 

 

It looks like this:

Smith         5%

Jones        10%

Mary           7%.

 

I want to create a field that assigns each of the records to a higher level grouping if they are in the top 75th percentile of % change (labeled TOP), another group if they are in the 20th-75th (labeled middle), and a 3rd group if they are in 0-19th (labeled bottom). 

 

I tried proc means but that puts the stats into a table, it also doesn't help me with the middle groupings. 

 

Thank you!

1 ACCEPTED SOLUTION
4 REPLIES 4
Rick_SAS
SAS Super FREQ

You can use PROC FORMAT to recode a variable.

For your particular application, this is equivalent to using PROC FORMAT to bin a variable.

 

You can run PROC MEANS to get the percentiles and then enter them manually into the VALUE statement in PROC FORMAT. Or if you want to automate the process, you can use the CNTLIN= option on PROC FORMAT to specify the name of a SAS data set that defines the percentiles.

 

For the manual option, suppose from PROC MEANS that the cutpoints for the three groups are 0.02 and 0.08. Then the PROC FORMAT would look like this:

proc format;
value PctlFmt  
      low  -<  0.02  = "Bottom"
      0.02  -<  0.08 = "Middle"
      0.08  -   high = "Top"
      ;
run;

data Have;
length Name $10;
input Name $ pct;
format pct PERCENT8.2;
datalines;
Smith  0.05
Jones  0.10
Mary   0.07
Thomas 0.012
Xu     0.03
;

proc print data=Have;
format Pct PctlFmt.;
run;
anweinbe
Quartz | Level 8

Thank you.

 

Pardon my ignorance...I'm very new to this. The only proc means I know is as follows, which doesn't give me the percentages that I'm looking for. Is there a way to specify that I want 80th% instead of the Q3 (75th)?

 

proc means data=temp.execdata nolabels n q1 q3 mean median min max std;

 

 

Reeza
Super User

I would suggest using PROC RANK. If you want percentiles in terms of 5th or 10th percentiles use the appropriate number of groups. The benefit of using this method is that you don't have to figure out the cutoffs and then group the variables, it's done in a single step.


This will create a new variable, mpg_city_rank in the data set cars that will be the values 0 to 19 and 0 is the 0 to 5th percentile, and 19th would be 95th to 100th percentile idea. 

If you used groups =10 it would be the 0 to 10th percentile and so forth. 

 

proc rank data=sashelp.cars out=cars groups=20;
var mpg_city;
ranks mpg_city_rank;
run;

@anweinbe wrote:

Good morning,

 

I have a file of individuals names and their change in ownership %'s between 1%-100%. There are approximately 500 records. 

 

It looks like this:

Smith         5%

Jones        10%

Mary           7%.

 

I want to create a field that assigns each of the records to a higher level grouping if they are in the top 75th percentile of % change (labeled TOP), another group if they are in the 20th-75th (labeled middle), and a 3rd group if they are in 0-19th (labeled bottom). 

 

I tried proc means but that puts the stats into a table, it also doesn't help me with the middle groupings. 

 

Thank you!


 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 3612 views
  • 1 like
  • 3 in conversation