NKormanik
Barite | Level 11

Can anyone with PROC HPSPLIT experience shed light on ANY of the parameters below, in particular the specific values to use where a (?) appears? What worked for you, what should be avoided, and why?

 

How might one decide which parameters to use?  Here are the main ones:

 

cvmethod=random(?)
intervalbins=?
maxbranch=?
maxdepth=?
mincatsize=?
minleafsize=?
nsurrogates=?

 

grow ?
prune ?
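
For orientation, here is a minimal sketch of where each of the options above sits in a PROC HPSPLIT call. It assumes the SAMPSIO.HMEQ sample data set is available, and every value shown is an illustrative placeholder, not a recommendation and not necessarily the default. The GROW and PRUNE criteria (ENTROPY, COSTCOMPLEXITY) are likewise just one possible choice.

```sas
/* Illustrative sketch only: all option values are placeholders.      */
/* Assumes the SAMPSIO.HMEQ sample data set is available.             */
proc hpsplit data=sampsio.hmeq
             cvmethod=random(10)  /* k-fold cross validation, k = 10   */
             seed=123             /* makes the random folds repeatable */
             intervalbins=100     /* bins used for interval inputs     */
             maxbranch=2          /* maximum children per split        */
             maxdepth=5           /* maximum depth of the tree         */
             mincatsize=1         /* minimum obs per category level    */
             minleafsize=5        /* minimum obs per leaf              */
             nsurrogates=1;       /* surrogate rules kept per split    */
   class Bad Delinq Derog Job Ninq Reason;
   model Bad(event='1') = Delinq Derog Job Ninq Reason CLAge CLNo
                          DebtInc Loan MortDue Value YoJ;
   grow entropy;                  /* splitting criterion               */
   prune costcomplexity;          /* pruning method                    */
run;
```

Changing one option at a time from a baseline run like this, and comparing the resulting fit statistics, is one way to see how much each parameter matters for a given data set.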

 

 

I'm interested in what experiences you've had using Proc HPSplit.  How useful did you find it?

 

If not PROC HPSPLIT, which other procedure did you find more to your liking?

 

Thanks!

 

Nicholas Kormanik

 

 

2 REPLIES
JosvanderVelden
SAS Super FREQ

Maybe the paper "An introduction to classification and regression trees with PROC HPSPLIT" has some useful information: http://www.mwsug.org/proceedings/2018/AA/MWSUG-2018-AA-42.pdf

You can also read the paper "PROC DTREE VS PROC HPSPLIT": https://www.lexjansen.com/wuss/2019/185_Final_Paper_PDF.pdf

NKormanik
Barite | Level 11

Thanks for the leads, JosvanderVelden.

 

No specific advice on the parameters.  Curiously.

 

HPSplit comes with defaults.  Maybe those are 'good enough'?

 

Why change the defaults, unless one has good reason to?  Eh?

 

 
