## Modified Percentiles by Group

Solved
Occasional Contributor
Posts: 11

# Modified Percentiles by Group

Thanks in advance for the help. I have a dataset that looks like the below, with a large number of simulated scenario results by Group and Subjects.

I am trying to calculate a modified percentile by Group or Subject. By modified I mean:

- A percentile (e.g., 99.5th) by Group or Subject across all scenarios

- An average of the 50 observations around the specified (e.g., 99.5th) percentile to be used in place of the single datapoint corresponding to the percentile; or even like avg of the observations that fall within the 99.25th and 99.75th percentile

How can be achieved?

 Scenarios Groups Subject Results 1 A Tom 100 1 A Tim 110 1 A Ted 120 1 B Bob 110 1 B Ben 100 1 B Bill 90 1 … … … 2 A Tom 105 2 A Tim 115 2 A Ted 125 2 B Bob 115 2 B Ben 105 2 B Bill 95 2 … … …

Accepted Solutions
Solution
‎10-26-2015 02:55 PM
Super User
Posts: 23,700

## Re: Modified Percentiles by Group

I don't think so but it wouldn't be hard to manually accomplish. You can use proc rank to assign 1000 ranks to each group and then compute the average within a proc means.

All Replies
Super User
Posts: 23,700

## Re: Modified Percentiles by Group

How much data do you have? You'll need a lot of observations PER GROUP/SUBJECT to differentiate the 99.25 from the 99.75th percentile.
Super User
Posts: 13,523

## Re: Modified Percentiles by Group

You may be looking for a Trimmed Mean.  Proc univariate will do trimmed means using the TRIM option. But I think you need to consider what you mean by "within the 99.25th and 99.75th percentile". How many records do you have for any group or subject? If there are 100 then you do not have any in that range as they would go from 99th to 100th percentile.

Can you provide what you would expect for the output of the example data you provided?

Occasional Contributor
Posts: 11

## Re: Modified Percentiles by Group

The dataset is fairly large, with results for about 10,000 scenarios, each with 7 groups having 25,000 subjects in each group. So total row size would be about 10k x 7 x 25k.

I would be calculating the percentile (99.5th) statistics for:

• Aggregate results (all groups and subjects) across the 10,000 scenarios
• Aggregate for a particular group across the scenarios
• A particular subject across the scenarios

By the modified percentile, the output I am expecting would be, using subject Tom as example:
If the results for Tom across all scenarios are sorted {1 [smallest], 2, 3, 4... 9925,... 9975... 10000th [largest])

My modified 99.5th percentile results for Tom would be the average of all observations between 9925 and 9975th observations.

Is there such a procedure within SAS that can calculate such modified percentile by group?

Solution
‎10-26-2015 02:55 PM
Super User
Posts: 23,700

## Re: Modified Percentiles by Group

I don't think so but it wouldn't be hard to manually accomplish. You can use proc rank to assign 1000 ranks to each group and then compute the average within a proc means.
Occasional Contributor
Posts: 11

## Re: Modified Percentiles by Group

Read up on proc ranks. Aggregating, running proc ranks and then averaging by proc means should work. Thanks for the advise!

🔒 This topic is solved and locked.