em_segment

NicolasC · Posted 11-02-2017 06:21 PM

Hi there

I created several scoring codes with SAS Miner that I am running on new datasets in SAS Guide. My target is interval (named amount). When I run the code, it creates several outputs (on top of the transformed variables IMP, LOG, etc.) such as P_amount, em_prediction, and em_segment. P_amount and em_prediction seem to be the same, and correspond to the predicted value of amount for my new dataset. How is em_segment created, and how does it relate to my predicted target value?

Thanks

WendyCzika · Posted 11-02-2017 08:46 PM

It can represent either clusters from the Clustering node or the leaf from a Decision Tree. There might be other nodes that produce it, but those are probably the two most common.

NicolasC · Posted 11-03-2017 02:33 AM

Thanks for your answer, Wendy. All of the models I use create these segments, whether they are decision trees or not. The number of segments (10, 20, etc.) is chosen by me in the Model Comparison node in SAS Miner. I am still wondering what they mean with regard to my interval target (or the predicted one). How exactly are they formed?

Thanks - Nicolas

WendyCzika · Posted 11-03-2017 03:57 PM

The bins (segments) are based on quantile binning of your predicted target (P_target).
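To make "quantile binning of the predicted target" concrete, here is a minimal Python sketch of the general technique. It is not the code SAS generates: the function names, the made-up predictions, the interpolation method, and the numbering direction (1 = lowest predictions) are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def quantile_cutpoints(predictions, n_bins=10):
    """Return the n_bins - 1 interior cutpoints that split the
    predictions into bins of (roughly) equal count."""
    return quantiles(predictions, n=n_bins, method="inclusive")


def assign_segment(prediction, cutpoints):
    """Map one prediction to a 1-based bin number.
    Assumption: bin 1 holds the lowest predicted values."""
    return bisect_right(cutpoints, prediction) + 1


# Made-up predicted amounts (hypothetical data, not from the thread).
p_amount = [float(v) for v in range(1, 101)]

cuts = quantile_cutpoints(p_amount, n_bins=10)
segments = [assign_segment(p, cuts) for p in p_amount]

# Each of the 10 bins receives the same number of observations.
bin_counts = Counter(segments)
print(sorted(bin_counts.items()))
```

With 10 bins, each segment is a decile of the predicted value, so the segment number is just a coarse rank of the prediction rather than a separate model output.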

NicolasC · Posted 11-06-2017 05:58 AM

Thanks for your answer, Wendy. As far as I understand, these deciles (in the case of scores 1-10) are based on the validation data, and the model keeps them fixed. For example, when the score code is applied to a test database, they are not recalculated as deciles of the predicted target in the test data. I am afraid this makes them quite dependent on the validation database, which is itself a random selection from a larger database. Do you think it would make sense to redefine them from the predicted target on each application?

Thanks
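The difference between the two options in the question above can be sketched in Python. This assumes, as the post describes, that the cutpoints are computed once on the validation predictions and then reused; the data, function names, and the choice of 4 bins are hypothetical and only serve to keep the numbers easy to check.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def cutpoints(predictions, n_bins):
    """Interior quantile cutpoints of a list of predictions."""
    return quantiles(predictions, n=n_bins, method="inclusive")


def segment(predictions, cuts):
    """Assign each prediction a 1-based bin number (1 = lowest)."""
    return [bisect_right(cuts, p) + 1 for p in predictions]


# Hypothetical predictions: the new data is shifted upward
# relative to the validation data.
validation_pred = [float(v) for v in range(1, 101)]   # 1..100
new_pred = [float(v) for v in range(51, 151)]         # 51..150

# Option A: reuse the cutpoints learned on the validation data.
frozen_cuts = cutpoints(validation_pred, n_bins=4)
frozen = Counter(segment(new_pred, frozen_cuts))

# Option B: recompute the cutpoints on the new data itself.
recomputed = Counter(segment(new_pred, cutpoints(new_pred, n_bins=4)))

print("frozen cutpoints:    ", sorted(frozen.items()))
print("recomputed cutpoints:", sorted(recomputed.items()))
```

In this toy case the frozen cutpoints push every new observation into the top two segments, while recomputing gives four equal-sized groups. The trade-off is that frozen cutpoints keep the meaning of a segment (a fixed range of predicted amount) comparable across datasets, whereas recomputed cutpoints keep segment sizes equal but let the boundaries move with each dataset.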
