SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello

I want to calculate following percentiles and then classify each row into a group.

I want to define groups (Ranks) based on following percentiles-

P5,P10,P20,P30,P40,P50,P60,P70,P80,P90,P95

 

The following code  create the percentiles-P10,P20,P30,P40,P50,P60,P70,P80,P90   but no P5,P95

What is the way to add them???

proc rank data=sashelp.cars out=have groups=10;
var invoice;
ranks rank;
run;


proc sql;
create table rank_groups_Definition as
select rank,
		min(invoice) as min,
		max(invoice) as max
from have
group by rank
;
quit;


data rank_groups_Definition_2;
set rank_groups_Definition;
_MIN_=lag(max)+1;
IF _MIN_=. then _MIN_=0;
drop min;
rename _min_=min;
Run;


proc sql;
create table want as
select a.*,b.rank
from sashelp.cars as a
left join rank_groups_Definition_2 as b
on    b.Min<=a.invoice<=b.Max
;
quit;

6 REPLIES 6
PaigeMiller
Diamond | Level 26
proc rank data=sashelp.cars out=have fraction;
var invoice;
ranks rank;
run;

proc format;
	value percentile 0-<0.05="< 5 percentile"
	    0.05-<0.1='5-10 percentile'
	    0.1-<0.2='10-20 percentile'
	    /* I'm lazy, you type the rest */
	    ...
    ;
run;

proc datasets library=work nolist;
    modify have;
    format rank percentile.;
run; quit;

 

Also, stay away from mathematical analyses in SQL steps when possible. In the long run, SQL is a poor choice for math. Just because you can do things in SQL does not mean you should do things in SQL. In this case and many others, PROC FORMAT is a better solution. Using PROC FORMAT allows you to put labels on the group that actually mean something to humans and it keeps the variable numeric (instead of 'P5' 'P10' etc.) so it will sort numerically in tables (instead of sorting alphabetically).

--
Paige Miller
Ksharp
Super User

Split your data into 100 groups. And recode it as your wish.

proc rank data=sashelp.cars out=have groups=100;
var invoice;
ranks rank;
run;
data have;
set have;
if rank<=4 then want='P5 ';
else if rank<=9 then want='P10 ';
else if rank<=19 then want='P20 ';
.........................


Ronein
Meteorite | Level 14

I am using it on my data set and I dont get the whole percentiles that I ask .


proc rank data=temp(Where=(revenue>0)) out=want groups=100;
var revenue;
ranks rank;
run;
PaigeMiller
Diamond | Level 26

@Ronein wrote:

I am using it on my data set and I dont get the whole percentiles that I ask .


What does that mean?

--
Paige Miller
ballardw
Super User

Show the log.

 

If you don't have more than at least 100 non-missing non-tied values in the input set you are not going to get 100 ranks. Period.

Ksharp
Super User
As Ballardw said ,maybe your data do not have unique 100 value , Try groups=20:

proc rank data=sashelp.cars out=have groups=20;
var invoice;
ranks rank;
run;
data have;
set have;
if rank=0 then want='P5 ';
else if rank=1 then want='P10 ';
else if rank in (2 3) then want='P20 ';
.........................

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1269 views
  • 0 likes
  • 4 in conversation