Hi Folks:
I'm trying to determine how many observations there are within each percentiles. The image below shows my current proc means output and the output that I want that includes the number of observations within each quantiles. Is there a way to accomplish Results Wanted?
PROC MEANS DATA=sashelp.stocks Q1 MEDIAN Q3 MAX MAXDEC=1;
CLASS STOCK;
VAR OPEN;
RUN;
Thank you for your time indeed!
Best wishes,
Cruise
1. Run PROC RANK to create the quartiles/percentiles of interest. The example below does the percentiles in groups of 10.
2. Run PROC FREQ to see the distributions.
proc rank data=sashelp.cars out=cars groups=10;
var mpg_city;
ranks rank_mpg_city;
run;
proc freq data=cars;
table rank_mpg_city;
run;
Does this work?
proc sql;
select STOCK
, min(OPEN) as MIN
, pctl(25,OPEN) as P25
, pctl(50,OPEN) as P50
, pctl(75,OPEN) as P75
, max(OPEN) as MAX
, sum(calculated MIN <= OPEN < calculated P25) as C1
, sum(calculated P25 <= OPEN < calculated P50) as C2
, sum(calculated P50 <= OPEN < calculated P75) as C3
, sum(calculated P75 <= OPEN <=calculated MAX) as C4
group by STOCK;
I'm curious about this approach. I got the error below. Trying to fix it but if you already know why i got this error, please let me know.
369 proc sql;
370 select STOCK
371 , min(OPEN) as MIN
372 , pctl(25,OPEN) as P25
373 , pctl(50,OPEN) as P50
374 , pctl(75,OPEN) as P75
375 , max(OPEN) as MAX
376 , sum(calculated MIN <= OPEN < calculated P25) as C1
377 , sum(calculated P25 <= OPEN < calculated P50) as C2
378 , sum(calculated P50 <= OPEN < calculated P75) as C3
379 , sum(calculated P75 <= OPEN <=calculated MAX) as C4
380 FROM sashelp.stocks
381 group by STOCK;
ERROR: Summary functions nested in this way are not supported.
ERROR: Summary functions nested in this way are not supported.
382 QUIT;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time
Sorry about this, I don't have SAS handy, so couldn't test.
Maybe this:
proc sql;
select quartiles.*
, sum( MIN <= OPEN < P25) as C1
, sum( P25 <= OPEN < P50) as C2
, sum( P50 <= OPEN < P75) as C3
, sum( P75 <= OPEN <= MAX) as C4
from (select STOCK
, min(OPEN) as MIN
, pctl(25,OPEN) as P25
, pctl(50,OPEN) as P50
, pctl(75,OPEN) as P75
, max(OPEN) as MAX
from SASHELP.STOCKS
group by STOCK ) quartiles
, SASHELP.STOCKS details
where quartiles.STOCK = details.STOCK
group by 1,2,3,4,5,6 ;
Two (nested) steps now, so probably slower than @Reeza 's solution.
Otherwise, replacing the calculated values with the calculation itself might work too, though speed may not benefit much either way.
1. Run PROC RANK to create the quartiles/percentiles of interest. The example below does the percentiles in groups of 10.
2. Run PROC FREQ to see the distributions.
proc rank data=sashelp.cars out=cars groups=10;
var mpg_city;
ranks rank_mpg_city;
run;
proc freq data=cars;
table rank_mpg_city;
run;
Thank you Reeza. If I use groups=3 in proc rank would it yield 25%, 50% and 75% as shown in the image below?
proc rank data=sashelp.cars out=cars groups=3;
var mpg_city;
ranks rank_mpg_city;
run;
proc freq data=cars;
table rank_mpg_city;
run;
Groups=3 means groups: 0-0.333 0.333-0.666 0.666-1
Thirtiles?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.