Hello
PROC HPSPLIT has some options for dealing with dichotomous outcome variables that are very unbalanced. But what if the outcome has three (or more) levels and they are unbalanced? I could not find any options for that case. For instance, using SAS 9.4 on Windows, I ran this:
data new;
   set sashelp.bweight;
   count + 1;
   if weight < 1500 then bwcat = "1: Very low";
   else if weight < 2500 then bwcat = "2: low";
   else bwcat = "3: Normal";
run;
and then
proc hpsplit data = new seed = 123;
   class black boy married momedlevel momsmoke bwcat;
   model bwcat = black boy married momedlevel momsmoke momage momwtgain visit cigsperday;
   output out=hpsplout;
run;
The result is not good. None of the very low birth weight babies are correctly classified, and fewer than 2% of the low birth weight babies are correctly classified. For a dichotomous outcome, we can adjust the sensitivity level in scoring, but that has no real analogue here.
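To put numbers on the imbalance, a quick frequency table of the derived outcome can be run. This is only a minimal sketch that assumes the data set new created above; it shows the share of each bwcat level and does nothing to address the classification problem itself.

```sas
/* Tabulate the three-level outcome to see how unbalanced it is. */
/* Assumes the data set NEW from the DATA step above.            */
proc freq data = new;
   tables bwcat / nocum;   /* counts and percents per level */
run;
```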
Any thoughts or suggestions are welcome.
Peter