Paul_NYS
Obsidian | Level 7

Hi Everyone

I would like to determine the number of observations that fall within the first 10, 25, 50, 75, and 90% of a population. I thought PROC MEANS would do this, since I already use it to calculate quartiles, but I am not sure that is the case. Variations of the PROC MEANS step below do not seem to work.

Paul

proc means data=test1 noprint missing;
   var TprFilingToIssueJoined;
   by County TprFileYear;
   output out=tprpercents n= nmiss= p10 p25 p50 p75 p90 p95 /autoname;
run;

1 ACCEPTED SOLUTION

Reeza
Super User

PROC MEANS works; you forgot the = sign after each percentile keyword (p10, p25, and so on) in your original code. There should have been a warning or something, though :S

If the divisions were equal, you could look at PROC RANK. You could also break the data into 20 groups and then regroup them, which is probably easier to code.

proc means data=test1 noprint missing;
   var TprFilingToIssueJoined;
   by County TprFileYear;
   output out=tprpercents n= nmiss= p10= p25= p50= p75= p90= p95= /autoname;
run;
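
Once the cutoffs are in tprpercents, one way to segment the population is to merge them back onto the detail data and compare each value against them. This is only a rough sketch: the data set name tprsegments and the variable segment are made up for illustration, and the cutoff names assume AUTONAME produced names of the form TprFilingToIssueJoined_P10 (check with PROC CONTENTS). The p95 cutoff is not used here.

```sas
/* Sketch only: attach each BY group's cutoffs to its observations, then label. */
/* Assumes test1 is sorted by County TprFileYear (the BY statement above        */
/* requires that as well) and that AUTONAME produced names such as              */
/* TprFilingToIssueJoined_P10; confirm with PROC CONTENTS on tprpercents.       */
data tprsegments;                      /* hypothetical output data set name */
   merge test1
         tprpercents(drop=_type_ _freq_);
   by County TprFileYear;
   length segment $6;                  /* hypothetical label variable */
   if missing(TprFilingToIssueJoined) then segment = ' ';
   else if TprFilingToIssueJoined <= TprFilingToIssueJoined_P10 then segment = '0-10';
   else if TprFilingToIssueJoined <= TprFilingToIssueJoined_P25 then segment = '11-25';
   else if TprFilingToIssueJoined <= TprFilingToIssueJoined_P50 then segment = '26-50';
   else if TprFilingToIssueJoined <= TprFilingToIssueJoined_P75 then segment = '51-75';
   else if TprFilingToIssueJoined <= TprFilingToIssueJoined_P90 then segment = '76-90';
   else segment = '91-100';
run;
```

A PROC FREQ on segment within each County and TprFileYear would then give the number of observations in each segment.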

5 REPLIES 5
Reeza
Super User

That doesn't make much sense analytically.

By definition, 10% of your data falls at or below the 10th percentile, i.e. about n*0.1 observations.

25% of your data falls at or below the 25th percentile, i.e. about n*0.25 observations.

Give or take 1, usually.

Orsini
Fluorite | Level 6

I use proc univariate for this:

title3 "make test data";
data test(drop=_:);
   do _i = 1 to 137;
     xyz = int(ranuni(_i)*1234);
     output;
   end;
run;

title3 "Get deciles with PCTLPTS= option";
proc univariate data=test noprint;
   var xyz;
   output out=deciles pctlpts=10 25 50 75 90 pctlpre=P;
run;

Paul_NYS
Obsidian | Level 7

I don't have SAS with me now, but I think Proc Univariate could work. I am trying to segment a data set (population) into smaller sub-populations based on the time (in days) it takes to achieve adjudication.

The population divisions would be percentages of the population who achieved adjudication the fastest: 10%, 25%, 50%, 75%, 90%. I need to know which time represents each of these points and then subdivide the population observations into each segment: 0-10, 11-25, etc.

So Proc Univariate appears to be a way to at least identify the observations that represent each percentile. Then I would assume I could segment the population using these values.

Paul

Ksharp
Super User

Why not use PROC RANK with groups=100?
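
A rough sketch of that idea, not tested against the original data: the names tprranked, tprsegments, pctgroup and segment are made up for illustration, and test1 is assumed to be sorted by County TprFileYear. With GROUPS=100 the group numbers run from 0 to 99, so they can be regrouped into the uneven segments afterward. BY groups with fewer than 100 observations will leave some group numbers unused.

```sas
/* Sketch only: assign each observation a percentile group within its BY group. */
proc rank data=test1 out=tprranked groups=100;
   by County TprFileYear;
   var TprFilingToIssueJoined;
   ranks pctgroup;                     /* hypothetical name; values 0 to 99 */
run;

/* Regroup the 100 percentile groups into the uneven segments. */
data tprsegments;                      /* hypothetical output data set name */
   set tprranked;
   length segment $6;                  /* hypothetical label variable */
   if missing(pctgroup) then segment = ' ';   /* missing values get no rank */
   else if pctgroup < 10 then segment = '0-10';
   else if pctgroup < 25 then segment = '11-25';
   else if pctgroup < 50 then segment = '26-50';
   else if pctgroup < 75 then segment = '51-75';
   else if pctgroup < 90 then segment = '76-90';
   else segment = '91-100';
run;
```

This skips the separate step of computing cutoffs, though the segment boundaries can differ slightly from the PROC MEANS percentiles because the two procedures do not define the groups in exactly the same way.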
