BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dmz2000
Calcite | Level 5

 Thanks in advance for the help. I have a dataset that looks like the below, with a large number of simulated scenario results by Group and Subjects.

 

I am trying to calculate a modified percentile by Group or Subject. By modified I mean:

- A percentile (e.g., 99.5th) by Group or Subject across all scenarios

- An average of the 50 observations around the specified (e.g., 99.5th) percentile to be used in place of the single datapoint corresponding to the percentile; or even like avg of the observations that fall within the 99.25th and 99.75th percentile 

 

How can be achieved?

 

ScenariosGroupsSubjectResults
1ATom100
1ATim110
1ATed120
1BBob110
1BBen100
1BBill90
1
2ATom105
2ATim115
2ATed125
2BBob115
2BBen105
2BBill95
2
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
I don't think so but it wouldn't be hard to manually accomplish. You can use proc rank to assign 1000 ranks to each group and then compute the average within a proc means.

View solution in original post

5 REPLIES 5
Reeza
Super User
How much data do you have? You'll need a lot of observations PER GROUP/SUBJECT to differentiate the 99.25 from the 99.75th percentile.
ballardw
Super User

You may be looking for a Trimmed Mean.  Proc univariate will do trimmed means using the TRIM option. But I think you need to consider what you mean by "within the 99.25th and 99.75th percentile". How many records do you have for any group or subject? If there are 100 then you do not have any in that range as they would go from 99th to 100th percentile.

 

Can you provide what you would expect for the output of the example data you provided?

dmz2000
Calcite | Level 5

The dataset is fairly large, with results for about 10,000 scenarios, each with 7 groups having 25,000 subjects in each group. So total row size would be about 10k x 7 x 25k. 

 

I would be calculating the percentile (99.5th) statistics for:

  • Aggregate results (all groups and subjects) across the 10,000 scenarios
  • Aggregate for a particular group across the scenarios
  • A particular subject across the scenarios

By the modified percentile, the output I am expecting would be, using subject Tom as example:
If the results for Tom across all scenarios are sorted {1 [smallest], 2, 3, 4... 9925,... 9975... 10000th [largest])

 

My modified 99.5th percentile results for Tom would be the average of all observations between 9925 and 9975th observations.

 

Is there such a procedure within SAS that can calculate such modified percentile by group? 

Reeza
Super User
I don't think so but it wouldn't be hard to manually accomplish. You can use proc rank to assign 1000 ranks to each group and then compute the average within a proc means.
dmz2000
Calcite | Level 5

Read up on proc ranks. Aggregating, running proc ranks and then averaging by proc means should work. Thanks for the advise!

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2031 views
  • 0 likes
  • 3 in conversation