Fluorite | Level 6

## how can I create a variable that indicates the percentile that each observation is in with weights?

I have data with a scaling factor and want to analyze the distribution of the values, so I create a variable that indicates the quartile of each observation. I do this with proc rank:

``````
proc rank data=have groups=4 out=want;
  var var1 var2;
  ranks quartile_var1 quartile_var2;
run;
``````

This code assigns each observation to a quartile but ignores the scaling factor.

My issue is that I have an unrepresentative sample with some oversampling of certain demographics, but I have a scaling factor (weight) to adjust for this. So I'd like to calculate the quartiles while adjusting for the oversampling with my scaling factor.

The weighted quartile cutoffs themselves can be obtained with proc means:

``````
proc means data=have p25 p50 p75;
  weight wt;
  var var1 var2;
run;
``````

Of course, I could first calculate the quartile cutoff values and then use if statements to build the variable, but the rank procedure is so elegant.

proc rank doesn't seem to support a weight statement, though.

Suggestions for how to do this quickly with few lines of code are much appreciated.

Opal | Level 21

## Re: how can I create a variable that indicates the percentile that each observation is in with weigh

A few lines of code:

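``````
/* Weighted quartile cutoffs; the sashelp.heart variable "weight" serves as the weight here */
proc univariate data=sashelp.heart noprint;
  var ageAtStart height;
  weight weight;
  output out=quant p25=p_a25 p_h25 p50=p_a50 p_h50 p75=p_a75 p_h75;
run;

data heartQuant;
  /* Read the one-row cutoff table once; its values are retained for every row */
  if _n_ = 1 then set quant;
  array a p_a25 p_a50 p_a75;
  array h p_h25 p_h50 p_h75;
  set sashelp.heart;
  /* Quartile 1 to 4: the loop stops at the first cutoff that lies above the value */
  if not missing(ageAtStart) then
    do ageQuant = 1 to 3 until(ageAtStart < a{ageQuant}); end;
  if not missing(height) then
    do heightQuant = 1 to 3 until(height < h{heightQuant}); end;
  drop p_: ;
run;
``````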
PG
Fluorite | Level 6

## Re: how can I create a variable that indicates the percentile that each observation is in with weigh

Thank you. I have to do this for about 20 variables.

So I have to add them to the second line, and then create an array for each variable, each with its own do statement.

What does the do statement do?

Opal | Level 21

## Re: how can I create a variable that indicates the percentile that each observation is in with weigh

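The do loop compares the value with the quartile boundaries in sequence: 1 (P25), 2 (P50), 3 (P75). It stops at the first boundary that the value is less than, and the loop index at that point is the quartile. If the value is not less than P75, the index runs on to 4 and the loop ends.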
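To see the loop in isolation, here is a minimal sketch with made-up cutoffs (10, 20, 30) and a handful of test values; none of these numbers come from the data above. It relies on the fact that an iterative do with an until clause does not increment the index on the iteration where the until condition becomes true.

``````sas
data _null_;
  /* Hypothetical cutoffs standing in for P25, P50, P75 */
  array cut{3} _temporary_ (10 20 30);
  do value = 5, 15, 20, 25, 35;
    /* Empty loop body: all the work is done by the index and the until test */
    do q = 1 to 3 until(value < cut{q}); end;
    put value= q=;
  end;
run;
/* The log should show q=1, 2, 3, 3, 4 for the five values.
   A value equal to a cutoff (20) goes to the upper group. */
``````

Note that the groups are numbered 1 to 4 here, whereas proc rank with groups=4 numbers them 0 to 3.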

PG
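For a longer variable list, the same two steps can be generated by a macro instead of being typed out per variable. The following is only a sketch under some assumptions: the macro name `weighted_quartiles`, the `_c1_`, `_c2_`, ... cutoff prefixes, the `q_` prefix for the new variables and the work data set `_cuts` are all made up for illustration, the input data must not already contain variables starting with `_c`, and each variable name needs to be short enough that `q_` plus the name stays within 32 characters. It uses the `pctlpts=` and `pctlpre=` options of the proc univariate output statement to name the cutoffs.

``````sas
%macro weighted_quartiles(data=, out=, vars=, weight=);
  %local i n v prefixes;
  %let n = %sysfunc(countw(&vars, %str( )));

  /* One cutoff prefix per analysis variable: _c1_ _c2_ ... */
  %do i = 1 %to &n;
    %let prefixes = &prefixes _c&i._;
  %end;

  /* Weighted P25/P50/P75 for every variable, stored as _c1_25, _c1_50, ... */
  proc univariate data=&data noprint;
    var &vars;
    weight &weight;
    output out=_cuts pctlpts=25 50 75 pctlpre=&prefixes;
  run;

  data &out;
    if _n_ = 1 then set _cuts;   /* one-row cutoff table, retained for all rows */
    set &data;
    %do i = 1 %to &n;
      %let v = %scan(&vars, &i, %str( ));
      array _a&i {3} _c&i._25 _c&i._50 _c&i._75;
      /* Same loop as above: q_<variable> ends up between 1 and 4 */
      if not missing(&v) then
        do q_&v = 1 to 3 until(&v < _a&i{q_&v}); end;
    %end;
    drop _c: ;
  run;
%mend weighted_quartiles;

/* Example call on the same sample data */
%weighted_quartiles(data=sashelp.heart, out=heartQuant,
                    vars=ageAtStart height cholesterol, weight=weight);
``````

Adding a variable then only means adding its name to the `vars=` list.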