ik84135
Fluorite | Level 6

Hello!

 

I am using Enterprise Guide 7.1 and trying to build decision trees with the HPSPLIT procedure. It works properly on one data set and a decision tree is generated, but an error occurs on another data set that has the same variables (only the values and the number of observations differ). What could be the cause of this error?

 

Thanks 🙂

4 REPLIES 4
PaigeMiller
Diamond | Level 26

The error message seems to explain the problem:

 

Re: ERROR: HPSPLIT found too many target levels to execute properly.

 

How many distinct levels are there in your target variable?
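One quick way to answer this is to count the distinct target values directly. This is a minimal sketch; the names `table_2` and `target_var` are taken from the code posted later in this thread, and `n_levels` and `n_obs` are illustrative column names.

```sas
/* Count the distinct target values and the total number of rows. */
/* table_2 and target_var are the names used in the poster's code. */
proc sql;
  select count(distinct target_var) as n_levels,
         count(*)                   as n_obs
  from table_2;
quit;
```

If `n_levels` is close to `n_obs`, the target is effectively continuous, and treating it as a classification target gives the procedure one level per distinct value.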

--
Paige Miller
ik84135
Fluorite | Level 6

The target variable is continuous. For example, one data set has 848 observations with 748 distinct target values, and that is where the error occurs. The other data set has 483 observations with 453 distinct target values, and it works properly. Is it possible to avoid this error somehow?

 

Thanks

ballardw
Super User

It may help to post the code you ran.

 

Perhaps this from the documentation of the procedure provides a clue:

Memory Considerations

The HPSPLIT procedure is designed for high-performance computing. As a result, it does not create utility files but rather stores all the data in memory. Data sets that have a large number of predictor variables and a large number of response levels can cause PROC HPSPLIT to run out of memory. One way to overcome this problem is to give SAS more memory to use. Another way is to use fewer threads, which reduces the memory that is required. You can use the NTHREADS= option in the PERFORMANCE statement to specify the number of threads. For more information, see the section PERFORMANCE Statement.

 

The combination of the number of target variable levels and the number of predictor variables affects memory usage. So the data set with fewer target variable levels did not hit the issue, but the other one does.
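As a rough illustration of the two remedies the documentation mentions, the sketch below first reports the memory limit of the current session and then runs the procedure on a single thread. The data set and variable names are assumptions taken from the code posted later in this thread. MEMSIZE can only be set when SAS starts, so in an Enterprise Guide setup it may need to be changed on the server side. This only addresses memory pressure and may not resolve an error about the number of target levels.

```sas
/* Report the memory limit of the current SAS session.           */
/* MEMSIZE is a startup option and cannot be changed in-session. */
proc options option=memsize;
run;

/* Run on one thread to reduce the memory the procedure needs.   */
/* table_2, target_var and var1-var3 are the poster's names.     */
proc hpsplit data=table_2 seed=123;
  class target_var;
  model target_var = var1 var2 var3;
  performance nthreads=1;
run;
```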

 

 

Knowing the options you actually used may allow more specific suggestions.

ik84135
Fluorite | Level 6

Could you explain exactly how the NTHREADS option works and what value I should use? I tried different values, but it didn't help. Are there any other solutions? The error occurs immediately after running the code:

proc hpsplit data=table_2 seed=123 assignmissing=branch cvmodelfit maxdepth=8 maxbranch=2 leafsize=10;
  class target_var;
  model target_var = var1 var2 var3;
  grow chaid;
  performance nthreads=5;
  prune costcomplexity(leaves=6);
  output out=tree1;
run;

 

 

Thanks
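One possibility that was not confirmed in this thread: listing the target in the CLASS statement asks HPSPLIT for a classification tree, so each of the 748 distinct values becomes its own target level. Leaving a continuous target out of CLASS requests a regression tree instead. The sketch below assumes a SAS/STAT release whose HPSPLIT supports regression trees, and that `var1` to `var3` are numeric predictors; any categorical predictors would still need to be listed in CLASS. The `grow rss` criterion is a swapped-in choice for an interval target, not the poster's original setting.

```sas
/* Regression-tree sketch: target_var is NOT in a CLASS statement, */
/* so it is treated as an interval (continuous) response.          */
proc hpsplit data=table_2 seed=123 maxdepth=8 maxbranch=2;
  /* class <categorical predictors only, if any>; */
  model target_var = var1 var2 var3;
  grow rss;                 /* residual-sum-of-squares criterion */
  prune costcomplexity;
  output out=tree1;
run;
```

If a classification tree is really what is wanted, the alternative would be to bin the target into a small number of groups before modeling.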

