Ajosh, let us focus on your original question regarding boosting, specifically this portion:

"I later used the output dataset from the end group nodes by merging the training and validation dataset which has the entire dataset. However, I observed that for every node number, the predicted probability of target = Y is not the same throughout the records which have the same node number. Also the range of these predicted probabilities overlap across node numbers. Also, the output from the end groups result window shows around 60% true positive rate and 70% true negative rate, which means some good amount of classification is happening due to boosting approach. My end objective is to derive patterns/if then rules from such a dataset. Is anyone aware of how can this be accomplished (is there any other node that needs to be used on the exported dataset of end group node and so on)??"

1. For the first portion, you merged the training and validation data sets (you selected the End Group node and went to the Exported Data button to find your data sets underneath, right?). The trained model and the validated model are physically two different ones, although logically they are the same model, since one is the validated, balanced version of the other. I wonder whether you could look at just one of them at a time. Whether you want to report model performance or extract score code, analytically you should stick with what comes off the validation data. It is indeed good practice to try to minimize the difference between the training and validation results. In other words, if the gap is big and varies from attempt to attempt, you may consider training 'better' to close the gap. Also keep checking whether performance on the validation data set is improving, or is at least stable.

2. As for the rules, please disregard my previous remark on that subject. It was largely correct, but I thought you were doing SGB.
I checked and don't see any major difference on this subject between EM 7.1 and EM 12.3 (the two user guides appear largely the same on group processing), so I am going to use EM 12.3 to speak about EM 7.1 here. For End Group processing, if you check the Flow Code and Score Code, what you find there is anything but flow code and score code for the group processing, unlike with SGB.

I could expand quite a bit on this, but I want to keep the focus on what you need. Could you clarify a bit why you need to "derive patterns/if then rules from such a dataset"? For example, are you trying to port the model elsewhere for scoring, or just to study the mechanics of the boosting process further?

I agree with Reeza about your follow-up questions on cut-off: you may get better responses if you post them as separate questions.

Best Regards
Jason Xin