Dear All:
I want to get decile groups for multiple variables.
The code I am using is:
Proc sort data = have ; by M ; run;
Data Want; set Have NOBS = NUMOBS ;
Decile_M = FLOOR(_N_*10 /(NUMOBS + 1)) ;
run;
But I want these deciles for at least 10 more variables. I can find the deciles one variable at a time, but is there a more efficient way of constructing them?
Thanx.
Randy
Use PROC RANK with the GROUPS=10 option if you want to assign observations to deciles.
Use PROC UNIVARIATE if you just want the decile values.
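For the second case (decile values only), a minimal sketch with PROC UNIVARIATE might look like the following. It assumes the data set `have` and variable `M` from the question; the second variable `M2`, the output data set `decile_values`, and the name prefixes are hypothetical and only there to show more than one variable at once.

```sas
/* Decile cut points for M and a hypothetical second variable M2 */
proc univariate data=have noprint;
   var M M2;
   output out=decile_values
          pctlpts=10 to 90 by 10   /* 10th, 20th, ..., 90th percentiles */
          pctlpre=M_p M2_p;        /* one name prefix per VAR variable  */
run;
```

The output data set should hold one row, with the cut points in variables named from each prefix plus the percentile (M_p10 through M_p90, M2_p10 through M2_p90).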
Paige Miller
PROC Rank is one of the fastest ways to create deciles in a batch. (The groups=10 option requests the deciles.)
proc rank data=have out=deciles ties=low descending groups=10;
var M;
ranks M_score;
run;
I use this paper when I need to grab code to do this myself:
Proc summary will calculate the overall quantiles.
Example:
proc summary data=have;
   var v1 v2 v3;
   output out=work.summary p10= p20= p30= p40= p50= p60= p70= p80= p90= / autoname;
run;
This creates the percentiles indicated. The variables will be named v1_<suffix of the percentile>, for example v1_P10.
Put the variables you want on the VAR statement. To cover every numeric variable, use var _numeric_; (if you omit the VAR statement, PROC SUMMARY only produces a count of observations, whereas PROC MEANS analyzes all the numeric variables). There is a technical concern about which definition is used for the quantiles, which is controlled by the option QNTLDEF= <a value from 1 to 5>. The default is 5. The details have to do with boundaries for the quantiles and tie breakers.
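As a sketch of how the quantile definition can be set, QNTLDEF= goes on the PROC statement. The value 3 and the output data set name `summary_def3` are arbitrary choices for illustration; v1 to v3 are the same placeholder variables as in the example above.

```sas
/* Same percentile request, but with quantile definition 3 instead of the default 5 */
proc summary data=have qntldef=3;
   var v1 v2 v3;
   output out=work.summary_def3 p10= p50= p90= / autoname;
run;
```

Comparing this output with the default run is a quick way to see whether the choice of definition matters for your data; the differences are usually confined to small samples and heavily tied values.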
If you want to know which decile a particular value falls into, use PROC RANK with GROUPS=10, which adds a ranking variable with values 0 to 9: 0 indicates that the value is in the first (lowest) decile and 9 that it is in the 10th.
proc rank data=have out=want groups=10;
   var v1 v2 v3;
   ranks rv1 rv2 rv3;
run;
The output data set will have the decile for v1 indicated in the variable rv1 and so forth.
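If deciles numbered 1 to 10 are easier to report than 0 to 9, one possible follow-up step is to shift the rank variables in a DATA step. This is only a sketch: it assumes the `want` data set and the rv1 to rv3 variables from the PROC RANK call above, and the output name `want_1to10` is made up for the example.

```sas
/* Shift PROC RANK groups from 0-9 to 1-10, leaving missing ranks missing */
data want_1to10;
   set want;
   array r{*} rv1 rv2 rv3;
   do i = 1 to dim(r);
      if not missing(r{i}) then r{i} = r{i} + 1;
   end;
   drop i;
run;
```

Extending this to more variables only requires adding them to the VAR and RANKS statements and to the array.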
The procedure also has a TIES= option, with values HIGH, LOW, MEAN, or DENSE, that controls how ranks are assigned to tied values. The default is MEAN.