How to get a summary for the top 5%, 10% and other...

07-26-2011 12:55 PM

I have a sample with one dependent dummy variable y (0, 1). The data have been sorted.

Now the question is how I can easily get a summary (e.g. mean, freq) for the top 5%, 10%, 15%, 20%, 25%, ... 95% of the sample. I hope there is a way to do this in one go, rather than using separate steps to produce each result. Is there a procedure that does this: PROC UNIVARIATE, PROC FREQ, PROC MEANS?

Posted in reply to bncoxuk

07-26-2011 01:00 PM

You have to tell us more. If you really do only have one dv, and it IS binary, I would think that just proc freq would provide everything that you need. If it is more complex, proc rank might be what you are looking for.

Posted in reply to bncoxuk

07-26-2011 01:27 PM

If the variable only has 2 levels it cannot be divided into 20 groups.

If you have a continuous variable, then PROC RANK will divide the data into 20 groups, and you can summarize using the group as a class variable.

    proc rank data=sashelp.heart out=g groups=20;
        var weight;
        ranks g;
    run;

    proc means data=g;
        class g;
        var weight;
    run;

Posted in reply to data_null__

07-26-2011 01:42 PM

Sorry that I did not explain clearly.

What I want is the descriptive statistics for the top 5% of the data, the top 10%, the top 15%, and so on. The data have already been sorted in the proper order. The descriptive statistics are for a single variable, which is a dummy (y = 0 or 1).

Posted in reply to bncoxuk

07-26-2011 01:48 PM

Just create a grouping variable based on obs number and total obs that makes 20 groups.
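A minimal sketch of that suggestion, assuming the sorted data sit in a data set called WORK.SORTED (a hypothetical name) with the 0/1 variable y. The names grouped, total and group are also made up for illustration. Note that this gives statistics for each 5% slice separately, not cumulative "top k%" figures.

```sas
/* Assumption: WORK.SORTED is already in the desired order and holds y (0/1). */
data grouped;
    set sorted nobs=total;            /* total = number of observations      */
    group = ceil(20 * _n_ / total);   /* 1 = first 5%, ..., 20 = last 5%     */
run;

/* One row of statistics per 5% slice */
proc means data=grouped n sum mean;
    class group;
    var y;
run;
```

Because y is 0/1, the mean in each slice is the share of ones and the sum is their count.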

Posted in reply to bncoxuk

07-26-2011 03:06 PM

You may also want to look into SAS/INSIGHT. If you license it, you can start it either by entering insight on the command line or by submitting the statement DM "insight";

I have found it to be an invaluable way to see both the statistics, and graphical representations, of one's data.

Posted in reply to art297

07-26-2011 03:21 PM

Thanks art297. I learned

Posted in reply to bncoxuk

08-02-2011 11:59 PM

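    /* Build a 100-row sample with a random 0/1 variable y */
    data test;
        do index=1 to 100;
            y=int(ranuni(-1)*2);
            output;
        end;
    run;

    proc sort data=test;
        by descending index;
    run;

    /* Read the number of observations into a macro variable */
    %let dsid=%sysfunc(open(test));
    %let nobs=%sysfunc(attrn(&dsid,nobs));
    %let dsid=%sysfunc(close(&dsid));

    /* Share of the sample to keep, counted from the top */
    %let want_percent=.2;
    %let want_top=%sysevalf(&nobs*&want_percent,integer);
    %put _user_;

    /* Mean and sum of y over the first &want_top observations */
    proc means data=test(obs=&want_top) nway noprint;
        var y;
        output out=want mean= sum=/autoname;
    run;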

Ksharp
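The code above handles one cutoff at a time (20% here). If all the cutoffs from 5% to 100% are wanted at once, one possible variation is a single DATA step that keeps a running count of y and writes a row each time a cutoff is reached. This is only a sketch and has not been run against real data: it assumes the sorted TEST data set from the post above, and the names cumulative, pct, n_obs, n_events and rate are invented for illustration. The cutoff row is taken as floor(total * pct / 100), which mirrors the integer truncation in the macro version.

```sas
/* Assumption: WORK.TEST is sorted as above and holds y (0/1). */
data cumulative(keep=pct n_obs n_events rate);
    set test nobs=total;
    n_events + y;                     /* running count of y=1 (sum statement) */
    do pct = 5 to 100 by 5;
        /* Write a row when this observation closes the top pct% */
        if _n_ = floor(total * pct / 100) then do;
            n_obs = _n_;
            rate  = n_events / n_obs; /* mean of y within the top pct%        */
            output;
        end;
    end;
run;

proc print data=cumulative noobs;
run;
```

With very small data sets, some cutoffs round down to zero rows and are simply skipped.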