bncoxuk
Obsidian | Level 7

I have a sample with one dependent dummy variable y (0, 1). The data has been sorted.

Now the question is how I can easily get a summary (e.g. mean, frequency) for the top 5%, 10%, 15%, 20%, 25%, ... 95% of the sample. I hope there is a way to do this easily, rather than using a separate step for each result. Is there a procedure that does this: PROC UNIVARIATE, PROC FREQ, PROC MEANS?

7 REPLIES
art297
Opal | Level 21

You have to tell us more. If you really do have only one dependent variable, and it IS binary, I would think that PROC FREQ alone would provide everything you need. If it is more complex, PROC RANK might be what you are looking for.

data_null__
Jade | Level 19

If the variable only has 2 levels it cannot be divided into 20 groups.

If you have a continuous variable, then PROC RANK will divide the data into 20 groups and you can summarize using the group variable as a class variable.

proc rank data=sashelp.heart out=g groups=20;
   var weight;
   ranks g;
run;

proc means data=g;
   class g;
   var weight;
run;

bncoxuk
Obsidian | Level 7

Sorry that I did not explain clearly.

What I want is the descriptive statistics for the top 5% of the data, the top 10%, the top 15%, and so on. The data have already been sorted in the proper order. The descriptive statistics are for a single variable only (a dummy, y = 0 or 1).

data_null__
Jade | Level 19

Just create a grouping variable, based on the observation number and the total number of observations, that makes 20 groups.
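
A minimal sketch of that idea, under some assumptions: the sorted data set is called test and the new variable group (both names are made up for illustration), and the row order is what defines "top". NOBS= on the SET statement supplies the total row count and _N_ the current row number, so ceil(20 * _n_ / total) labels each row with its 5% band.

```sas
/* Hypothetical sample: 100 rows of a 0/1 variable, already in the desired order */
data test;
   call streaminit(2024);
   do index = 1 to 100;
      y = rand('bernoulli', 0.5);
      output;
   end;
run;

/* group = 1 for the first 5% of rows, 2 for the next 5%, ..., 20 for the last 5% */
data grouped;
   set test nobs=total;
   group = ceil(20 * _n_ / total);
run;

/* Count, sum and mean of y within each 5% band */
proc means data=grouped n sum mean;
   class group;
   var y;
run;
```

Note that these bands are not cumulative. To summarize, say, the top 10% as a whole, the same PROC MEANS could be run without the CLASS statement and with a filter such as where group <= 2.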

art297
Opal | Level 21

You may also want to look into SAS/INSIGHT. If you license it, you can start it either by entering the word insight on the command line or by submitting the statement DM "insight";

I have found it to be an invaluable way to see both the statistics, and graphical representations, of one's data.

bncoxuk
Obsidian | Level 7

Thanks, art297. I learned something new :)

Ksharp
Super User
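/* Build a 100-row sample with a random 0/1 variable y */
data test;
 do index=1 to 100;
  y=int(ranuni(-1)*2);
  output;
 end;
run;
/* Put the rows in the order that defines "top" */
proc sort data=test ;
 by descending index;
run;

/* Read the number of observations into a macro variable */
%let dsid=%sysfunc(open(test));
%let nobs=%sysfunc(attrn(&dsid,nobs));
%let dsid=%sysfunc(close(&dsid));

/* Fraction wanted, and the matching number of leading rows (truncated to an integer) */
%let want_percent=.2;
%let want_top=%sysevalf(&nobs*&want_percent,integer);
%put _user_;

/* Summarize y over the first &want_top rows only */
proc means data=test(obs=&want_top) nway noprint;
 var y;
 output out=want mean= sum=/autoname;
run;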

Ksharp
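
The code above handles one cutoff per run. A possible way to get every cutoff from 5% to 95% in a single DATA step is to keep a running sum of y and write one row each time the row number reaches the next cutoff. This is only a sketch under stated assumptions: the data set and variable names (sample, top_summary, next_pct, cum_sum, top_pct, n, mean_y) are made up for illustration, the rows are already in the desired order, y has no missing values, and each cutoff is taken as the first row at or beyond that percentage (so it rounds up, where the macro above truncates).

```sas
/* Hypothetical sample: 100 rows of a 0/1 variable, already in the desired order */
data sample;
   call streaminit(2024);
   do index = 1 to 100;
      y = rand('bernoulli', 0.5);
      output;
   end;
run;

/* One output row per cutoff: 5, 10, ..., 95 percent of the rows */
data top_summary;
   set sample nobs=total;
   retain next_pct 5;
   cum_sum + y;                    /* running count of rows with y = 1 */
   /* write a row each time the current row reaches the next cutoff */
   do while (next_pct <= 95 and _n_ >= total * next_pct / 100);
      top_pct = next_pct;          /* cutoff, in percent */
      n       = _n_;               /* rows included so far */
      mean_y  = cum_sum / _n_;     /* proportion of y = 1 among them */
      output;
      next_pct + 5;
   end;
   keep top_pct n cum_sum mean_y;
run;

proc print data=top_summary noobs;
run;
```

The DO WHILE loop, rather than a plain IF, is there so that a very small data set, where one row can pass several cutoffs at once, still gets a row for each cutoff.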


Discussion stats
  • 7 replies
  • 2846 views
  • 6 likes
  • 4 in conversation