NKormanik
Barite | Level 11

Does anyone with Proc HPSplit experience have advice on ANY of the parameters below, in particular the specific values to use where a (?) appears? What worked for you, what should be avoided, and why?

 

How might one decide which parameters to use?  Here are the main ones:

 

cvmethod=random(?)
intervalbins=?
maxbranch=?
maxdepth=?
mincatsize=?
minleafsize=?
nsurrogates=?

 

grow ?
prune ?

 

 

I'm interested in what experiences you've had using Proc HPSplit.  How useful did you find it?

 

If not Proc HPSplit, which other procedure did you find more to your liking?

 

Thanks!

 

Nicholas Kormanik

 

 

2 REPLIES
JosvanderVelden
SAS Super FREQ

Maybe the paper "An introduction to classification and regression trees with PROC HPSPLIT" has some useful information: http://www.mwsug.org/proceedings/2018/AA/MWSUG-2018-AA-42.pdf

You can also read the paper "PROC DTREE VS PROC HPSPLIT": https://www.lexjansen.com/wuss/2019/185_Final_Paper_PDF.pdf

NKormanik
Barite | Level 11

Thanks for the leads, JosvanderVelden.

 

Curiously, though, no specific advice on the parameters.

 

HPSplit comes with defaults.  Maybe those are 'good enough'?

 

Why change the defaults unless one has good reason to?
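
For anyone following along, here is a minimal sketch of where each of the options from my first post goes in a call. Everything specific in it is an assumption on my part: the SAMPSIO.HMEQ sample data set and its variable names, the seed, and the 10 folds. The numeric tree options are set to what I understand the documented defaults to be, so check them against the documentation for your release before relying on them.

```sas
/* Sketch only. Data set, variables, seed, and fold count are assumptions. */
proc hpsplit data=sampsio.hmeq
             cvmethod=random(10)  /* k-fold cross validation, k=10 assumed */
             seed=12345           /* fixes the random fold assignment */
             intervalbins=100     /* bins used for interval inputs */
             maxbranch=2          /* binary splits */
             maxdepth=10          /* maximum tree depth */
             mincatsize=1         /* min. observations for a category level */
             minleafsize=1        /* min. observations per leaf */
             nsurrogates=0;       /* no surrogate rules */
   class bad reason job;          /* target and nominal inputs */
   model bad = loan mortdue value reason job yoj
               derog delinq clage ninq clno debtinc;
   grow entropy;                  /* splitting criterion */
   prune costcomplexity;          /* pruning method */
run;
```

Starting from a call like this, one can change a single option at a time (for example, a larger minleafsize or a smaller maxdepth) and compare the cross-validated error, instead of guessing at all of them at once.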

 

 
