I am using Enterprise Miner (EM) Decision Trees to investigate a wide variety of models. To simplify things, I have two models using the same dataset. I would like to use the same data and the same data partition for both model runs. The only difference between the two models is the Target variable.
It was highly recommended to set the Drop specifier to Y rather than set the Level specifier to Rejected. But the problem is that when I do this the variable literally gets rejected from my EM run and I cannot choose to change the Target variable to another one when I do my second model.
I hope that makes sense. Any recommendations?
Also, there are some variables that I still want to keep in my EM dataset for posterity, but there is no way I will use them in the model itself. So basically my variables are either Level = Target, Input, or Key. Can anyone suggest the best way to code these up so that there are no problems down the line? I suppose I can set them to INPUT, then set Drop = Y, and call it a day.
Thank you.
If I'm understanding correctly, I don't think you want to use Drop=Y in either of these situations. That removes the variable from the data set for all subsequent nodes. For the two-target situation, as long as you are not using Target1 as an input when modeling Target2 and vice versa, you can set them both as Target variables, then in each modeling node set Use=Yes for the one you want to use as the target and Use=No for the other. If you do want to use each one as an input when modeling the other, you can use a Metadata node to switch the two variables between target/input and input/target, if that makes sense.
For the other situation where you want to keep variables but not include them in the model, you can either set them as Role=Rejected to begin with, or as Role=Input with Use=No in whatever nodes you want to exclude them from.
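To make the difference between Drop, Role, and Use a bit more concrete, here is a small toy model in Python. It is only a sketch of the behavior described above, not Enterprise Miner code or any SAS API; the `Var` class, the `exported_columns` and `model_spec` helpers, and the variable names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Var:
    """Hypothetical stand-in for one row of the EM variables table."""
    name: str
    role: str            # "Target", "Input", "Rejected", or "Key"
    use: bool = True     # per-node Use=Yes/No
    drop: bool = False   # Drop=Y: variable is removed for later nodes


def exported_columns(variables):
    """Columns still present for subsequent nodes (everything not dropped)."""
    return [v.name for v in variables if not v.drop]


def model_spec(variables, target):
    """Return (targets, inputs) for one modeling node.

    `target` names the Target variable with Use=Yes in this node;
    every other Target variable is treated as Use=No.
    """
    kept = [v for v in variables if not v.drop]
    targets = [v.name for v in kept if v.role == "Target" and v.name == target]
    inputs = [v.name for v in kept if v.role == "Input" and v.use]
    return targets, inputs


variables = [
    Var("Target1", "Target"),
    Var("Target2", "Target"),
    Var("age", "Input"),
    Var("notes", "Rejected"),              # kept in the data, never modeled
    Var("legacy_id", "Input", use=False),  # kept, excluded in this node
]

run1 = model_spec(variables, target="Target1")
run2 = model_spec(variables, target="Target2")
cols = exported_columns(variables)

# With Drop=Y the second target is gone, so it cannot be selected later.
dropped = [Var("Target1", "Target"), Var("Target2", "Target", drop=True)]
run2_after_drop = model_spec(dropped, target="Target2")
```

In this sketch both runs see the same columns and the same inputs, and only the selected target changes, whereas a dropped target cannot be picked up again downstream.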