H
Pyrite | Level 9

Hey community,

 

I have a sample that I am running through PROC HPSPLIT to fit a binary (one-split) decision tree. I plan to run a number of bootstrap replicates of the data set through the procedure and record the value at which the tree splits on the single continuous predictor. I created a reproducible example below.

 

First issue: PROC HPSPLIT does not seem to support a BY statement. I suppose a workaround is a macro wrapper that feeds each Replicate value into a WHERE statement. I could probably figure that out, but I am a realist and know it would take me over an hour to get it working, so I am asking for your help writing it.

 

Second issue: since you are all very savvy, I was also hoping you could figure out a way to pull out the split value for each bootstrap sample. The current code uses only 5 replicates (rep=5), but if I bump it up to 50, could you help create an automated way to pull out all of the split values and put them into a SAS data set? Let me know if you have questions!

 

proc surveyselect data=sashelp.heart out=heart_boot noprint
    seed=1
    method=urs
    samprate=1
    outhits
    rep=5;
run;


data heart_boot_trim;
	set heart_boot;
	keep Replicate status ageatstart;
run;

ods graphics on;
ods trace on;
proc hpsplit data=heart_boot_trim maxdepth=1;
    /* by Replicate; */  /* doesn't work */
    class Status;
    model Status (event='Dead') = AgeAtStart;
    prune costcomplexity;
    partition fraction(validate=0.3 seed=1234);
    rules file='rules.txt';
run;
ods graphics off;
ods trace off;
1 REPLY

H
Pyrite | Level 9

Well, I put a macro wrapper around it so that it runs each bootstrap sample, but I am still looking for a way to pull out the split value on the age variable. For the code below, I would like the split value for each of the 5 generated trees, ideally placed into a file. The code writes a file with the tree rules, but as written each run overwrites the previous output file. I am not sure whether getting it to output 5 files and scraping the values out of those is a good idea, or whether ODS TRACE can be used to find the pieces.

 

Thanks!

proc surveyselect data=sashelp.heart out=heart_boot NOPRINT 
     seed=1                                         
     method=urs 
     samprate=1                          
     outhits
     rep = 5; 
run;

data heart_boot_trim;
	set heart_boot;
	keep Replicate status ageatstart;
run;

ods graphics on;
%macro run_model;
    %do i = 1 %to 5;
        proc hpsplit data=heart_boot_trim maxdepth=1;
            where Replicate = &i;
            /* by Replicate; */
            class Status;
            model Status (event='Dead') = AgeAtStart;
            prune costcomplexity;
            partition fraction(validate=0.3 seed=1234);
            /* Single quotes suppress macro resolution, so every iteration writes the same literal file name */
            rules file='rules&&i.txt';
        run;
    %end;
%mend run_model;
%run_model;

 

 

So the desired file would have something like:

Sample  Split_Value
1       45.340
2       47.150
3       45.170
4       46.020
5       45.340
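
One possible way to build that table is to scrape the rules files with a DATA step. This is only a rough sketch under a few assumptions that are not confirmed anywhere in this thread: that the RULES statement in the macro is changed to `rules file="rules&i..txt";` (double quotes, two periods) so each replicate gets its own file rules1.txt through rules5.txt; that those files land in the current working directory; and that each file contains a line where AgeAtStart is followed by a comparison operator and the numeric cutoff. The data set name split_values and the regular expression are hypothetical, so check an actual rules file and adjust the pattern before relying on it.

```sas
/* Sketch: read rules1.txt ... rules5.txt and keep the first AgeAtStart cutoff found in each. */
/* ASSUMPTION: each file has a line such as "AgeAtStart < 45.34"; adjust the pattern to        */
/* match what a real rules file looks like.                                                    */
data split_values (keep=Sample Split_Value);
    length fname $ 200 line $ 400;

    /* Compile the pattern once: variable name, comparison operator, then a number */
    rx = prxparse('/AgeAtStart\s*[<>]=?\s*(-?\d+(\.\d+)?)/i');

    do Sample = 1 to 5;                          /* raise the upper bound to match rep= */
        fname = cats('rules', Sample, '.txt');   /* rules1.txt, rules2.txt, ...         */
        found = 0;

        /* FILEVAR= opens whichever file fname currently names; "rulesin" is a dummy fileref */
        infile rulesin filevar=fname end=eof truncover;

        do while (not eof);
            input line $char400.;
            if not found and prxmatch(rx, line) then do;
                /* Capture group 1 holds the numeric cutoff as text */
                Split_Value = input(prxposn(rx, 1, line), 32.);
                found = 1;
                output;
            end;
        end;
    end;

    stop;   /* everything is read in one pass, so end the DATA step here */
run;

proc print data=split_values noobs;
run;
```

If a tree is pruned back to the root and no split is written, that replicate simply produces no row in split_values, which is worth checking when the replicate count goes up to 50.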
