Hi there
I created several sets of score code with SAS Enterprise Miner, which I am running on new datasets in SAS Enterprise Guide. My target is interval (named amount). When I run the code, it creates several output variables (on top of the transformed variables IMP, LOG, etc.) such as P_amount, em_prediction and em_segment. P_amount and em_prediction seem to be identical and correspond to the predicted value of amount for my new dataset. How is em_segment created, and how does it relate to my predicted target value?
Thanks
em_segment can represent either the cluster from the Clustering node or the leaf from a Decision Tree node. There might be other nodes that produce it, but those are probably the two most common.
Thanks for your answer, Wendy. All of the models I use create these segments, whether they are decision trees or not. The number of segments (10, 20, etc.) is chosen by me in the Model Comparison node in SAS Enterprise Miner. I am still wondering what they mean with regard to my interval target (or the predicted one). How exactly are they formed?
Thanks - Nicolas
The bins (segments) are based on quantile binning of your predicted target (P_target).
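As a rough illustration of what quantile binning of a predicted target means, here is a minimal Python sketch. It is not the score code that SAS Enterprise Miner generates; the function names, the sample predictions, and the convention that segment 1 holds the lowest predictions are all assumptions made for the example.

```python
from bisect import bisect_right


def quantile_cutoffs(values, n_bins):
    """Return n_bins - 1 cutoffs that split the values into roughly equal-sized groups.

    Hypothetical helper: cutoff k is the value found k/n_bins of the way
    through the sorted predictions.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def assign_segment(value, cutoffs):
    """Return a segment number from 1 to len(cutoffs) + 1.

    Assumption: segment 1 is the lowest bin; the numbering direction used by
    the real score code may differ.
    """
    return bisect_right(cutoffs, value) + 1


# Made-up predicted amounts standing in for P_amount.
p_amount = [12.0, 55.0, 31.0, 8.0, 47.0, 90.0, 23.0, 66.0, 15.0, 74.0]

cutoffs = quantile_cutoffs(p_amount, 5)
segments = [assign_segment(p, cutoffs) for p in p_amount]

for p, s in zip(p_amount, segments):
    print(f"P_amount={p:6.1f} -> segment {s}")
```

With 10 predictions and 5 bins, each segment ends up with 2 observations, so the segment number only tells you where a prediction ranks among the others, not anything beyond the predicted value itself.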
Thanks for your answer, Wendy. As far as I understand, these deciles (in the case of scores 1-10) are based on the validation data, and the score code keeps the cutoffs fixed. That is, when the code is applied to a test dataset, the segments are not recalculated as deciles of the predicted target in the test data. My concern is that this makes the segments quite dependent on the validation dataset, which is itself a random sample from a larger dataset. Do you think it would make sense to redefine them from the predicted target on each application?
Thanks
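To make the trade-off in this question concrete, here is a small Python sketch comparing the two options: reusing cutoffs computed once on validation predictions versus recomputing them on each new dataset. The data are made up, quartiles are used instead of deciles to keep the example short, and `segment_with` is a hypothetical helper; none of this reproduces SAS Enterprise Miner's own logic.

```python
from bisect import bisect_right
from statistics import quantiles


def segment_with(cutoffs, predictions):
    """Assign each prediction a segment number (1 = lowest bin, by assumption)."""
    return [bisect_right(cutoffs, p) + 1 for p in predictions]


# Made-up predictions: the new data are shifted upward relative to validation.
validation_pred = [10, 20, 30, 40, 50, 60, 70, 80]
new_pred = [46, 56, 66, 76, 86, 96, 106, 116]

# Option 1: cutoffs frozen from the validation predictions.
frozen_cutoffs = quantiles(validation_pred, n=4)
frozen = segment_with(frozen_cutoffs, new_pred)

# Option 2: cutoffs recomputed from the new predictions.
recomputed_cutoffs = quantiles(new_pred, n=4)
recomputed = segment_with(recomputed_cutoffs, new_pred)

print("frozen cutoffs:    ", frozen_cutoffs, "->", frozen)
print("recomputed cutoffs:", recomputed_cutoffs, "->", recomputed)
```

With frozen cutoffs, a segment number always corresponds to the same range of predicted amounts, so segments stay comparable across datasets, but the groups may be very unequal in size when the new data differ from the validation sample. Recomputing gives equal-sized groups on every dataset, at the cost that the same segment number can cover different predicted amounts from one application to the next.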
