Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- Calculating percentile for a variable but based on another variable?

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 08-27-2020 04:57 AM
(873 views)

Hello,

I was wondering if it is possible to calculate percentiles of a variable, not for itself, but relative to the values of another variable?

Consider the data below:

```
data neighborhood_money;
input Name $ 1-7 Gender $ 9 Poor 11-14 Rich 16-19;
datalines;
Andrea F 28.6 .
Carole F 24.0 24.0
Clayton M 27.0 .
Curtis M 22.6 22.6
Doug M 27.3 27.3
Ellen F 27.0 27.0
Jan F 31.2 31.2
Jimmy M 22.4 22.5
Karin F 34.6 .
Mick M 29.0 .
Richard M 29.7 .
Sam M 27.2 .
Susan F 35.1 .
;
proc sort data=neighborhood_money out=pairs;
by gender;
run;
```

If I use proc rank as follows:

```
proc rank data=pairs out=rankpair PERCENT;
by gender;
var poor rich;
run;
```

I will get percentile values, by gender, for "poor" and "rich". However, the percentile value for "poor" is based on other "poor" values, and similarly for "rich" percentiles.

Is there instead a way, that I can get percentile values for "poor" relative to the values of "rich"?

**What I want**:

So for Andrea, who has Poor = 28.6, when calculating the percentile, I only want to consider how Andrea's value of 28.6 is ranked against the other entries in the "Rich" column, for gender = F.

So Andrea's percentile for poor= 28.6 is calculated by looking at: Carole = 24.0 , Ellen = 27.0, and Jan = 31.2, as these women have values for the "Rich" column.

But she is not compared to the other two females, Karin = 34.6 (poor) and Susan = 35.1 (poor) or any of the men.

Is this possible to do? Calculate ranks based on other variables?

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Let me see if I understand this. Since you are doing a BY-group analysis, you only need to look at one gender at a time. Let's look at females:

```
data F;
input Name $ 1-7 Gender $ 9 Poor 11-14 Rich 16-19;
datalines;
Andrea F 28.6 .
Carole F 24.0 24.0
Ellen F 27.0 27.0
Jan F 31.2 31.2
Karin F 34.6 .
Susan F 35.1 .
;
```

Now you want to compute the rank of the 'Poor' score with respect to the nonmissing 'Rich' scores.

The Rich scores are always {24, 27, 31.2}. So you want

Andrea: 28.6 is what percentile in {24, 27, 28.6, 31.2}

Carol: 24 is what percentile in {24, 27, 31.2}. (OR do we use {24, 24, 27, 31.2} ?)

...

Susan: 35.1 is what percentile in {24, 27, 31.2, 35.1}.

The short answer is that I do not think this is a computation built into any SAS procedure. I think you will have to program it yourself in the DATA step or PROC IML (much easier in IML).

4 REPLIES 4

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

And for your example of Andrea what is the resulting "percentile" you expect?

Also you should provide an example of what you expect the output data set or report to look like.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

If we are ranking Andrea's value of 28.6 against the values: 24.0, 27.0, and 31.2, Andrea's percentile rank should = ~67 (66.666*), given that 2/3 of the sample are smaller than 28.6.

Thanks!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Let me see if I understand this. Since you are doing a BY-group analysis, you only need to look at one gender at a time. Let's look at females:

```
data F;
input Name $ 1-7 Gender $ 9 Poor 11-14 Rich 16-19;
datalines;
Andrea F 28.6 .
Carole F 24.0 24.0
Ellen F 27.0 27.0
Jan F 31.2 31.2
Karin F 34.6 .
Susan F 35.1 .
;
```

Now you want to compute the rank of the 'Poor' score with respect to the nonmissing 'Rich' scores.

The Rich scores are always {24, 27, 31.2}. So you want

Andrea: 28.6 is what percentile in {24, 27, 28.6, 31.2}

Carol: 24 is what percentile in {24, 27, 31.2}. (OR do we use {24, 24, 27, 31.2} ?)

...

Susan: 35.1 is what percentile in {24, 27, 31.2, 35.1}.

The short answer is that I do not think this is a computation built into any SAS procedure. I think you will have to program it yourself in the DATA step or PROC IML (much easier in IML).

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You are correct.

My solution was to separate the "rich" group into a separate table, then using left join in proc sql and using count(b.name) to find the number of people that meet the condition that a.poor ge b.rich , grouping by gender.

This gave me the number of people 'rich' below each 'poor', by gender, and then I manually computed a percentile by dividing by the total number of rich.

Seemed like the easiest way to go about it.

My solution was to separate the "rich" group into a separate table, then using left join in proc sql and using count(b.name) to find the number of people that meet the condition that a.poor ge b.rich , grouping by gender.

This gave me the number of people 'rich' below each 'poor', by gender, and then I manually computed a percentile by dividing by the total number of rich.

Seemed like the easiest way to go about it.

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.