Can anyone with Proc HPSplit experience shed light on ANY of the parameters below, in particular the specific numbers to use where the (?) appears? What worked for you, what should be avoided, and why?
How might one decide which parameters to use? Here are the main ones:
cvmethod=random(?)
intervalbins=?
maxbranch=?
maxdepth=?
mincatsize=?
minleafsize=?
nsurrogates=?
grow ?
prune ?
I'm interested in what experiences you've had using Proc HPSplit. How useful did you find it?
If not Proc HPSplit, what other procedure did you find more to your liking?
Thanks!
Nicholas Kormanik
Maybe the paper "An introduction to classification and regression trees with PROC HPSPLIT" has some useful information: http://www.mwsug.org/proceedings/2018/AA/MWSUG-2018-AA-42.pdf
You can also read the paper "PROC DTREE VS PROC HPSPLIT": https://www.lexjansen.com/wuss/2019/185_Final_Paper_PDF.pdf
Thanks for the leads, JosvanderVelden.
Curiously, no specific advice on the parameters.
HPSplit comes with defaults. Maybe those are 'good enough'?
Why change the defaults unless one has good reason to, eh?
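For anyone landing on this thread, here is a minimal sketch of where the options from the original question go in a Proc HPSplit call. It is an illustration, not a recommendation: every value is a placeholder chosen only to show the syntax, and it assumes the SAMPSIO.HMEQ sample data set (used in the SAS documentation examples) is available in your installation. Check the defaults and allowed ranges against the documentation for your release.

```sas
/* Sketch only: option values are illustrative, not tuned recommendations. */
/* Assumes the SAMPSIO.HMEQ sample data set is available.                  */
proc hpsplit data=sampsio.hmeq
             cvmethod=random(10)  /* k-fold cross validation with k=10     */
             seed=12345           /* makes the random folds repeatable     */
             intervalbins=100     /* bins used for interval inputs         */
             maxbranch=2          /* maximum children per split (binary)   */
             maxdepth=10          /* maximum depth of the grown tree       */
             mincatsize=1         /* min observations for a category level */
             minleafsize=5        /* minimum observations per leaf         */
             nsurrogates=0;       /* surrogate rules for missing values    */
   class Bad Delinq Derog Job Ninq Reason;
   model Bad(event='1') = Delinq Derog Job Ninq Reason
                          CLAge CLNo DebtInc Loan MortDue Value YoJ;
   grow entropy;                  /* splitting criterion                   */
   prune costcomplexity;          /* pruning method                        */
run;
```

One way to answer the "why change the defaults" question empirically is to change a single option at a time (say, minleafsize or maxdepth) and compare the cross-validated error between runs.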