I'm building a decision tree in SAS Enterprise Miner. My flow is the data source, then a data partition node, then a filter node, and finally the decision tree node. When I build the tree interactively, some of the splits don't make sense to me. For example, I have one nominal variable that can be either "S" or "U". When the tree chooses this variable to split on, it has two branches: "S" and "Missing Values Only". None of the entries are missing this variable. I thought that maybe the "U"s were being filtered out by default because there were too few of them, but when I changed the "Default Filtering Method" to "None", the results did not change. I also tried changing the seed number in the data partition node, but that didn't help either. Does anybody know why this might be happening, or how I can fix it?
Thanks,
My experience is that when SAS says there are missing values and the user says there are no missing values, I believe SAS. So you need to look at your data carefully, or do a PROC FREQ on this particular variable (or whatever the equivalent is in Enterprise Miner).
It's not that SAS is saying there are some missing values. It is only identifying the "S"s and labeling the rest as "Missing", so I'm not sure what's happening to the "U"s. I don't know what PROC FREQ is. I'm new to SAS and have been working primarily with Enterprise Miner.
Still, you need to look at your data set somehow and see exactly what is in there.
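For anyone who has not used it, PROC FREQ prints a count of every distinct value of a variable. Here is a minimal sketch of the kind of check meant above. The data set name `work.mydata` and the variable name `status` are placeholders, not names from this thread, so substitute your own. The second step is only a guess at one possible cause (stray characters or blanks in the stored values), not a diagnosis.

```sas
/* Count every distinct value of the variable.            */
/* work.mydata and status are hypothetical names.         */
/* The MISSING option lists missing values as their own   */
/* row in the table instead of only counting them below.  */
proc freq data=work.mydata;
    tables status / missing;
run;

/* Same count, but display the stored characters as hex,  */
/* so values that look identical on screen (for example   */
/* a value with a leading blank) show up as separate rows. */
proc freq data=work.mydata;
    tables status / missing;
    format status $hex4.;
run;
```

If the first table shows only "S" and a missing row, the "U" values are not in the data the tree is seeing. If it shows both "S" and "U", the values are being lost somewhere between the data source and the tree node.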