Hi everyone,
I would like to determine the number of observations that fall within 10, 25, 50, 75, and 90% of a population. I thought PROC MEANS would do this, since I already use it to calculate quartiles, but I am not sure that is the case. Variations of the PROC MEANS step below do not seem to work.
Paul
proc means data=test1 noprint missing;
var TprFilingToIssueJoined;
by County TprFileYear;
output out=tprpercents n= nmiss= p10 p25 p50 p75 p90 p95 /autoname;
run;
PROC MEANS works; you forgot the = sign after each percentile keyword (p10, p25, and so on) in your original code. There should have been a warning or something, though :S
If you wanted equal divisions you could look at PROC RANK, though you could also break the data into 20 groups and regroup them, which is probably easier to code.
proc means data=test1 noprint missing;
var TprFilingToIssueJoined;
by County TprFileYear;
output out=tprpercents n= nmiss= p10= p25= p50= p75= p90= p95= /autoname;
run;
That doesn't make much sense analytically.
By definition, 10% of your data falls at or below the 10th percentile, i.e. n*0.1 observations.
Likewise, 25% of your data falls at or below the 25th percentile, i.e. n*0.25 observations.
Give or take 1, usually.
I use proc univariate for this:
title3 "make test data";
data test(drop=_:);
do _i = 1 to 137;
xyz = int(ranuni(_i)*1234);
output;
end;
run;
title3 "Get deciles with PCTLPTS= option";
proc univariate data=test noprint;
var xyz;
output out=deciles pctlpts=10 25 50 75 90 pctlpre=P;
run;
I don't have SAS with me now, but I think Proc Univariate could work. I am trying to segment a data set (population) into smaller sub-populations based on the time (in days) it takes to achieve adjudication.
The population divisions would be percentages of the population who achieved adjudication the fastest: 10%, 25%, 50%, 75%, 90%. I need to know which time represents each of these points and then subdivide the population observations into each segment: 0-10, 11-25, etc.
So Proc Univariate appears to be a way to at least identify the observations that represent each percentile. Then I would assume I could segment the population using these values.
Paul
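For what it's worth, the two steps described above (find the cut points, then assign each observation to a segment) could be sketched roughly as follows. This is only an illustration, not tested against the real data: it reuses the TEST/XYZ naming from the PROC UNIVARIATE example, the seed 12345 is arbitrary, and the names SEGMENTED and SEGMENT are made up for the example.

```sas
/* Stand-in test data, as in the earlier example (seed is arbitrary). */
data test(drop=_:);
  do _i = 1 to 137;
    xyz = int(ranuni(12345)*1234);
    output;
  end;
run;

/* Step 1: one-row data set holding the cut points P10 P25 P50 P75 P90. */
proc univariate data=test noprint;
  var xyz;
  output out=deciles pctlpts=10 25 50 75 90 pctlpre=P;
run;

/* Step 2: attach the cut points to every row and label each observation. */
data segmented;   /* hypothetical output name */
  /* Read the single cut-point row once; values read by SET are retained. */
  if _n_ = 1 then set deciles;
  set test;
  length segment $8;
  if missing(xyz)     then segment = ' ';
  else if xyz <= P10  then segment = '0-10';
  else if xyz <= P25  then segment = '10-25';
  else if xyz <= P50  then segment = '25-50';
  else if xyz <= P75  then segment = '50-75';
  else if xyz <= P90  then segment = '75-90';
  else                     segment = '90-100';
run;

/* Count the observations in each segment. */
proc freq data=segmented;
  tables segment / missing;
run;
```

With BY groups such as County and TprFileYear, the cut points would presumably be produced with the same BY statement and then combined with the sorted detail data using MERGE and the same BY variables, in place of the `if _n_ = 1` read.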
Why not use PROC RANK with groups=100?
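A rough sketch of that idea, assuming a data set TEST with a numeric variable XYZ as in the PROC UNIVARIATE example; RANKED, PCT_GROUP, RANKED_SEG and SEGMENT are names invented for the illustration. PROC RANK with GROUPS=100 numbers the groups 0 through 99, so the uneven segments are rebuilt by regrouping those values.

```sas
/* Assign each observation to one of 100 groups (0 = lowest, 99 = highest). */
proc rank data=test out=ranked groups=100;
  var xyz;
  ranks pct_group;   /* hypothetical name for the group variable */
run;

/* Regroup the 100 groups into the uneven segments of interest. */
data ranked_seg;     /* hypothetical output name */
  set ranked;
  length segment $8;
  if missing(pct_group)    then segment = ' ';
  else if pct_group < 10   then segment = '0-10';
  else if pct_group < 25   then segment = '10-25';
  else if pct_group < 50   then segment = '25-50';
  else if pct_group < 75   then segment = '50-75';
  else if pct_group < 90   then segment = '75-90';
  else                          segment = '90-100';
run;

proc freq data=ranked_seg;
  tables segment / missing;
run;
```

The segment boundaries may differ slightly from the PROC UNIVARIATE percentiles, since the two procedures compute their cut points differently, especially with ties or small groups.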