Viani_P
Calcite | Level 5

Hello, 

I would like to build a decision tree using proc hpsplit. My dataset has 13,000 observations and the data are weighted, so I need to take the weights into account. Is it possible to add a weight statement in proc hpsplit? I asked ChatGPT to generate candidate SAS code, and it produced:

 

proc hpsplit data=sorted_data seed=12345 method=tree(splitrule=gini) maxdepth=3;
   class target;
   model target = var1 var2 var3 / weight=FINALWGT_LP; 
   output out=out_tree predicted=predicted_prob predlevel=predicted_level;
run;

 

When I ran this code, I got syntax error messages for the text in green.  However, after modifying the SAS code and removing all text in green, I was able to produce a tree (not sure how correct it is, but seemed to work). Unfortunately, this updated code was not taking weights into account anymore.

 

Can I incorporate a weight statement into proc hpsplit? Or is there another procedure better suited to building decision trees from weighted data?

 

I appreciate any guidance!

Accepted Solutions
ballardw
Super User

HPSplit supports a WEIGHT statement.

proc hpsplit data=sorted_data seed=12345  maxdepth=3;
   class target;
   model target = var1 var2 var3 ; 
   output out=out_tree ;
   weight FINALWGT_LP; 
run;

If the weight variable is missing or non-positive (negative or zero), the observation will be excluded from the model.
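
Because of that exclusion rule, it may be worth counting how many rows would be dropped before fitting the tree. The following is a minimal sketch, assuming the data set and weight variable names from the question (sorted_data, FINALWGT_LP); the output column names are made up for illustration.

```sas
/* Count observations whose weight would exclude them from the model.   */
/* sorted_data and FINALWGT_LP come from the question; n_total,         */
/* n_missing_weight and n_nonpositive_weight are illustrative names.    */
proc sql;
   select count(*)                        as n_total,
          sum(missing(FINALWGT_LP))       as n_missing_weight,
          /* A missing numeric value compares as less than 0 in SAS,    */
          /* so exclude missings here to avoid counting them twice.     */
          sum(FINALWGT_LP <= 0
              and not missing(FINALWGT_LP)) as n_nonpositive_weight
   from sorted_data;
quit;
```

If either of the last two counts is greater than zero, the tree is being fit on fewer than the full 13,000 observations.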

 

I have no idea where ChatGPT gets its rules from, but it seems to have lots of problems with SAS syntax.

 

Check your online help for the procedure under Syntax to find basic options.
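
For example, the generated code tried to request the Gini criterion through method=tree(splitrule=gini), which is not PROC HPSPLIT syntax. As far as I know, the splitting criterion is set with a separate GROW statement. A sketch that keeps the weights and asks for Gini, using the data set and variable names from the question, might look like this (check the GROW and OUTPUT statement documentation for your SAS/STAT release before relying on it):

```sas
/* Sketch: weighted classification tree grown with the Gini criterion.  */
/* Names (sorted_data, target, var1-var3, FINALWGT_LP) are taken from   */
/* the question.                                                        */
proc hpsplit data=sorted_data seed=12345 maxdepth=3;
   class target;                   /* list categorical predictors here too */
   model target = var1 var2 var3;
   grow gini;                      /* overrides the default splitting criterion */
   weight FINALWGT_LP;             /* WEIGHT is a statement, not a MODEL option */
   output out=out_tree;            /* per-observation scoring output */
run;
```

The names of the predicted-value columns in out_tree are chosen by the procedure, so inspect the output data set (for example with PROC CONTENTS) rather than trying to name them in the OUTPUT statement.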

2 REPLIES 2
Viani_P
Calcite | Level 5

thank you so much! this worked 🙂 
