Proc Means for determining number of observations ...

10-06-2014 06:15 PM

Hi Everyone

I would like to determine the number of observations that fall within the 10th, 25th, 50th, 75th, and 90th percentiles of a population. I thought PROC MEANS would do this, since I already use it to calculate quartiles, but I am not sure that is the case. Variations of the PROC MEANS step below do not seem to work.

Paul

proc means data=test1 noprint missing;
  var TprFilingToIssueJoined;
  by County TprFileYear;
  output out=tprpercents n= nmiss= p10 p25 p50 p75 p90 p95 /autoname;
run;

Accepted Solutions

10-06-2014 09:24 PM

PROC MEANS works; you forgot the = sign after each percentile keyword (p10, p25, and so on) in your original code. There should have been a warning or error in the log, though.

If the divisions were equal in size, you could look at PROC RANK. You could also break the data into 20 groups and regroup them as needed, which is probably easier to code.

proc means data=test1 noprint missing;
  var TprFilingToIssueJoined;
  by County TprFileYear;
  output out=tprpercents n= nmiss= p10= p25= p50= p75= p90= p95= /autoname;
run;

All Replies

10-06-2014 06:25 PM

That doesn't make much sense analytically.

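By definition, about 10% of your data falls at or below the 10th percentile, i.e. roughly n*0.1 observations.

Likewise, about 25% of your data falls at or below the 25th percentile, i.e. roughly n*0.25 observations.

Give or take 1, usually.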

10-06-2014 07:12 PM

I use proc univariate for this:

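title3 "make test data";
/* 137 pseudo-random integers; the loop counter _i is dropped from the output */
data test(drop=_i);
  do _i = 1 to 137;
    xyz = int(ranuni(_i)*1234);
    output;
  end;
run;

title3 "Get percentiles with PCTLPTS= option";
/* writes one row with variables P10 P25 P50 P75 P90 */
proc univariate data=test noprint;
  var xyz;
  output out=deciles pctlpts=10 25 50 75 90 pctlpre=P;
run;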

10-06-2014 08:47 PM

I don't have SAS with me now, but I think Proc Univariate could work. I am trying to segment a data set (population) into smaller sub-populations based on the time (in days) it takes to achieve adjudication.

The population divisions would be percentages of the population who achieved adjudication the fastest: 10%, 25%, 50%, 75%, 90%. I need to know which time represents each of these points and then subdivide the population observations into each segment: 0-10, 11-25, etc.

So Proc Univariate appears to be a way to at least identify the observations that represent each percentile. Then I would assume I could segment the population using these values.

Paul
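A minimal sketch of the segmentation step described above, assuming the cutoffs come from PROC UNIVARIATE as in the earlier reply. The data set names `cutoffs` and `segmented`, the variable `segment`, and the segment labels are hypothetical, chosen only for illustration; the test data stands in for the real time-to-adjudication variable.

```sas
/* Hypothetical test data standing in for the real population */
data test;
  do _i = 1 to 137;
    xyz = int(ranuni(1)*1234);
    output;
  end;
  drop _i;
run;

/* One-row data set holding the cutoffs P10 P25 P50 P75 P90 */
proc univariate data=test noprint;
  var xyz;
  output out=cutoffs pctlpts=10 25 50 75 90 pctlpre=P;
run;

/* Read the cutoffs once (variables read with SET are retained),
   then label every observation with its segment */
data segmented;
  if _n_ = 1 then set cutoffs;
  set test;
  length segment $6;
  if missing(xyz) then segment = ' ';
  else if xyz <= P10 then segment = '0-10';
  else if xyz <= P25 then segment = '11-25';
  else if xyz <= P50 then segment = '26-50';
  else if xyz <= P75 then segment = '51-75';
  else if xyz <= P90 then segment = '76-90';
  else segment = '91-100';
  drop P10 P25 P50 P75 P90;
run;

/* Count the observations that landed in each segment */
proc freq data=segmented;
  tables segment;
run;
```

For cutoffs computed per County and TprFileYear, the one-row SET would not apply; one option is to merge the cutoff data set back onto the sorted detail data with the same BY variables.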

10-07-2014 10:28 AM

Why not use PROC RANK with GROUPS=100?
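A rough sketch of that suggestion, not tested against the original data. The data set `ranked`, the rank variable `pct_group`, and the segment labels are hypothetical names; the test data is made up. With GROUPS=100, PROC RANK assigns group values 0 through 99, and missing values of the analysis variable get a missing rank.

```sas
/* Hypothetical test data */
data test;
  do _i = 1 to 137;
    xyz = int(ranuni(1)*1234);
    output;
  end;
  drop _i;
run;

/* pct_group runs from 0 (lowest values) to 99 (highest values) */
proc rank data=test out=ranked groups=100;
  var xyz;
  ranks pct_group;
run;

/* Regroup the 100 rank groups into the unequal segments of interest */
data segmented;
  set ranked;
  length segment $6;
  if missing(pct_group) then segment = ' ';
  else if pct_group < 10 then segment = '0-10';
  else if pct_group < 25 then segment = '11-25';
  else if pct_group < 50 then segment = '26-50';
  else if pct_group < 75 then segment = '51-75';
  else if pct_group < 90 then segment = '76-90';
  else segment = '91-100';
run;
```

To rank within each County and TprFileYear, a BY statement could be added to PROC RANK, assuming the input is sorted by those variables. Because every cutoff here is a multiple of 5, GROUPS=20 (values 0 through 19) would also work, with the thresholds scaled down accordingly.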