@PurpleNinja wrote:
Hi Reeza,
You may be right. I may not want to break the ties.
My ultimate goal is to split the predicted probabilities from the random forest into equally large decile groups. Perhaps ties are not the problem, but I don't know what else is causing the imbalance in the sample sizes.
Let's ignore the ties for now. Given this problem with the macro for my predicted probabilities from the random forest, how should I solve it?
Thanks.
DO NOT CALL THEM DECILES THEN. They also cannot be used to appropriately measure the differences because the groups are not actually different and any metrics will be uninterpretable.
But if you want 10 equal groups, use a data step with the NOBS= option on the SET statement to get the total row count, then assign the group from the row number.
This doesn't work well if you have small data, because all the leftover rows (up to n_groups - 1 of them) land in the last group, but if you have a lot of data and bigger groups it's fine.
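%let n_groups=5; /* number of groups; use 10 for ten groups */

data want;
    /* NOBS= puts the total row count of the input into n_total */
    set sashelp.class nobs=n_total;
    retain group_size;

    if _n_ = 1 then do;
        /* base group size; leftover rows end up in the last group */
        group_size = floor(n_total / &n_groups);
        group_index = 1;
    end;

    /* the sum statement implicitly retains group_index across rows */
    if _n_ > group_size * group_index then
        group_index + 1;

    /* cap the index so the remainder rows stay in the last group */
    if group_index > &n_groups then group_index = &n_groups.;
run;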
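If the oversized last group is a problem, one possible variation is to compute the group straight from the row position, which spreads the leftover rows so group sizes differ by at most one. This is only a sketch, not a tested solution for your data: the `sorted` and `want_balanced` dataset names are made up, and `height` in sashelp.class is a stand-in for your predicted probability. The rows have to be sorted by that variable first, and tied values can still be split across groups, so the caveat about not calling them deciles still applies.

```sas
%let n_groups=5;

/* sort by the variable to be grouped (height stands in for the predicted probability) */
proc sort data=sashelp.class out=sorted;
    by height;
run;

data want_balanced;
    set sorted nobs=n_total;
    /* row position scaled to 1..n_groups; group sizes differ by at most one */
    group_index = ceil(_n_ * &n_groups / n_total);
run;

/* check the resulting group sizes */
proc freq data=want_balanced;
    tables group_index;
run;
```

With the 19 rows in sashelp.class and 5 groups, this should give one group of 3 and four groups of 4, instead of four groups of 3 and one of 7.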