Help using Base SAS procedures

Modified Percentiles by Group

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

Modified Percentiles by Group

 Thanks in advance for the help. I have a dataset that looks like the below, with a large number of simulated scenario results by Group and Subjects.

 

I am trying to calculate a modified percentile by Group or Subject. By modified I mean:

- A percentile (e.g., 99.5th) by Group or Subject across all scenarios

- An average of the 50 observations around the specified (e.g., 99.5th) percentile to be used in place of the single datapoint corresponding to the percentile; or even like avg of the observations that fall within the 99.25th and 99.75th percentile 

 

How can be achieved?

 

ScenariosGroupsSubjectResults
1ATom100
1ATim110
1ATed120
1BBob110
1BBen100
1BBill90
1
2ATom105
2ATim115
2ATed125
2BBob115
2BBen105
2BBill95
2

Accepted Solutions
Solution
‎10-26-2015 02:55 PM
Super User
Posts: 17,829

Re: Modified Percentiles by Group

I don't think so but it wouldn't be hard to manually accomplish. You can use proc rank to assign 1000 ranks to each group and then compute the average within a proc means.

View solution in original post


All Replies
Super User
Posts: 17,829

Re: Modified Percentiles by Group

How much data do you have? You'll need a lot of observations PER GROUP/SUBJECT to differentiate the 99.25 from the 99.75th percentile.
Super User
Posts: 10,500

Re: Modified Percentiles by Group

You may be looking for a Trimmed Mean.  Proc univariate will do trimmed means using the TRIM option. But I think you need to consider what you mean by "within the 99.25th and 99.75th percentile". How many records do you have for any group or subject? If there are 100 then you do not have any in that range as they would go from 99th to 100th percentile.

 

Can you provide what you would expect for the output of the example data you provided?

Occasional Contributor
Posts: 11

Re: Modified Percentiles by Group

The dataset is fairly large, with results for about 10,000 scenarios, each with 7 groups having 25,000 subjects in each group. So total row size would be about 10k x 7 x 25k. 

 

I would be calculating the percentile (99.5th) statistics for:

  • Aggregate results (all groups and subjects) across the 10,000 scenarios
  • Aggregate for a particular group across the scenarios
  • A particular subject across the scenarios

By the modified percentile, the output I am expecting would be, using subject Tom as example:
If the results for Tom across all scenarios are sorted {1 [smallest], 2, 3, 4... 9925,... 9975... 10000th [largest])

 

My modified 99.5th percentile results for Tom would be the average of all observations between 9925 and 9975th observations.

 

Is there such a procedure within SAS that can calculate such modified percentile by group? 

Solution
‎10-26-2015 02:55 PM
Super User
Posts: 17,829

Re: Modified Percentiles by Group

I don't think so but it wouldn't be hard to manually accomplish. You can use proc rank to assign 1000 ranks to each group and then compute the average within a proc means.
Occasional Contributor
Posts: 11

Re: Modified Percentiles by Group

Read up on proc ranks. Aggregating, running proc ranks and then averaging by proc means should work. Thanks for the advise!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 371 views
  • 0 likes
  • 3 in conversation