How do you know which ones go to which group? At that rate how is it different than random? You definitely can't call it quartiles...
EDIT: My mistake, it won't matter, since that's the group you're summarizing. But don't call them quartiles; that would be misleading.
proc sort data=sashelp.cars out=cars;
   by mpg_highway;
run;
data check;
   set cars nobs=num;
   n_group=floor(_n_/(num/4)); *group number from the sorted row position;
   if n_group=4 then n_group=3; *reassign last obs: when _n_=num the formula gives 4;
run;
proc means data=check;
   class n_group;
   var mpg_highway;
run;
Use PROC RANK to form the quartiles, as shown in the article "Grouping observations based on quantiles."
Then use the CLASS statement in PROC MEANS to compute the statistics for each quantile, as shown in this example:
%let NumGroups = 4;
proc rank data=Sashelp.cars out=Want groups=&NumGroups ties=high;
var MSRP; /* variable on which to group */
ranks Group; /* name of variable to contain groups 0,1,...,k-1 */
run;
proc format; /* display 0-3 as Q1-Q4 */
value Quartile 0="Q1" 1="Q2" 2="Q3" 3="Q4";
run;
proc means data=Want N MEAN STD;
var MSRP;
class Group;
format group Quartile.;
run;
I hope the following link helps:
http://support.sas.com/resources/papers/proceedings10/135-2010.pdf
Your code is helpful. Why are you reassigning the last observation?
Yes, unfortunately this problem can occur. See the article "Binning data by quantiles? Beware of rounded data." The article says that if there are N observations and k groups, then "if there are more than N/k repeated values, the repeated value can occupy more than one quantile value. In fact, this will always happen if a particular value is repeated more than 2N/k times."
The article concludes with this warning: "Beware of using quantiles to bin rounded data into groups. Although the technique works great when almost all of the data values are distinct, you can run into problems if you ask for many bins and your data contain many repeated values."
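As a rough illustration of that warning, here is a small sketch on made-up data (the data set names rounded and ranked and the values themselves are assumptions, not taken from the article). One value is repeated 15 times out of N=20, which is more than 2N/k = 10 for k=4 groups. PROC RANK gives tied values the same rank and therefore the same group, so with only three distinct values you cannot get four groups of roughly equal size.

```sas
/* Hypothetical data: 20 observations, only 3 distinct values */
data rounded;
   do i = 1 to 20;
      if i <= 3 then x = 1;        /* 3 observations  */
      else if i <= 18 then x = 2;  /* 15 observations */
      else x = 3;                  /* 2 observations  */
      output;
   end;
   drop i;
run;

/* Ask for 4 groups; tied values always land in the same group */
proc rank data=rounded out=ranked groups=4;
   var x;
   ranks Group;
run;

/* Cross-tabulate group by value to see how the bins were filled */
proc freq data=ranked;
   tables Group*x / norow nocol nopercent;
run;
```

The PROC FREQ table shows which groups are actually populated and how unevenly the observations are spread.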
If you want your ties to be randomly split between quartiles, use small random tie breakers:
data cars;
set sashelp.cars;
/* Add random noise smaller than data precision */
qtr_mpg_highway = mpg_highway + 0.00001*rand("uniform");
run;
proc rank data=cars out=check groups=4;
var qtr_mpg_highway; /* no RANKS statement, so this variable is overwritten with the group number 0-3 */
run;
proc means data=check;
class qtr_mpg_highway;
var mpg_highway;
run;
_n_ is an automatic variable that counts DATA step iterations. It isn't a true row number, but here I'm using it as one.
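On the question of why the last observation gets reassigned, here is a small sketch with a made-up count of num=8 (the data set name demo, the loop counter i standing in for _n_, and the variable alt_group are only for illustration). When the counter equals num, the formula evaluates to floor(4)=4, which would create a fifth group. Subtracting 1 from the counter first is one possible alternative that stays in 0-3 without the extra IF statement.

```sas
/* Hypothetical example: 8 rows, 4 groups */
data demo;
   num = 8;
   do i = 1 to num;                       /* i plays the role of _n_ */
      n_group   = floor(i/(num/4));       /* formula from the thread: last row gives 4 */
      alt_group = floor((i-1)/(num/4));   /* assumed alternative: values stay in 0-3 */
      output;
   end;
run;

proc print data=demo noobs;
   var i n_group alt_group;
run;
```

With these eight rows, n_group comes out as 0,1,1,2,2,3,3,4 and alt_group as 0,0,1,1,2,2,3,3, so the original formula also leaves the first group one row short.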