09-24-2013 07:09 AM
Currently constructing a predictive modelling data layer. Just wondering if a layer is allowed to have duplicate payments, i.e. the same id, but with the payment binned twice into two different bins? My initial thought is that a payment should be in one bucket or the other, not both, as this would compromise the layer and any subsequent modelling from the layer.
09-29-2013 04:40 AM
Thanks for your reply... I don't think I explained my problem correctly... apologies. I am in the process of creating a predictive modeling data layer using SAS Base. I have been asked to bin the same payment into different descriptive bins. My concern is that if a payment is binned into two different bins, these duplicate payments will lead the model, and possibly its results, to be incorrect. I think this may be the case. However, I am not crash hot on predictive modeling using SAS EM.
10-01-2013 04:56 PM
Yes, you can definitely code the same payment values under different binning criteria/schemas/rules/cuts/wishes. This is often seen with modelers using BASE. When using EM, it is not unusual to see a modeler add a SAS Code node to run his or her custom coding on the same variables, alongside whatever EM is doing with the variables, to compare and test. The logic behind this is: while rules of thumb or general guidelines often apply, the 'best' cuts/bins are often determined by trial and error.
After coding the same payment variable into different bucket variables, you should, though, expect them to be highly correlated. Depending on specifics, sometimes you select one over the others. Sometimes you build them into PCA or factors. The reality is that the data, in your case the payment data, are typically NOT collected with any analytics in mind. The data are just ENTERED into your database. You have to configure them to suit your models. The payment variable is like your foot. Of course you should try different pairs of shoes to decide which one fits best.
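To make the idea concrete, here is a rough sketch in Python rather than SAS, purely as an illustration. It keeps one row per payment id and adds each binning scheme as its own bucket variable, so no payment is duplicated, then checks how correlated the two bucket variables are. The payment ids, amounts, cut points and the names `bin_index`, `pearson`, `bin_coarse` and `bin_fine` are all made up for the example and are not taken from the thread.

```python
from bisect import bisect_right
from math import sqrt

# Hypothetical payments: exactly one entry per payment id (values are invented).
payments = {101: 25.0, 102: 80.0, 103: 150.0, 104: 400.0, 105: 950.0, 106: 1200.0}

# Two hypothetical binning schemes applied to the same payment amount.
COARSE_CUTS = [100.0, 500.0]                      # 3 bins
FINE_CUTS = [50.0, 100.0, 250.0, 500.0, 1000.0]   # 6 bins


def bin_index(value, cuts):
    """Return the 0-based bin of value; a value equal to a cut point goes to the upper bin."""
    return bisect_right(cuts, value)


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# The layer stays at one row per id; each scheme becomes a separate bucket variable.
layer = {
    pid: {
        "amount": amount,
        "bin_coarse": bin_index(amount, COARSE_CUTS),
        "bin_fine": bin_index(amount, FINE_CUTS),
    }
    for pid, amount in payments.items()
}

coarse = [row["bin_coarse"] for row in layer.values()]
fine = [row["bin_fine"] for row in layer.values()]

# A quick, rough check of how redundant the two bucket variables are.
r = pearson(coarse, fine)
print(f"correlation between the two bucket variables: {r:.3f}")
```

Treating the bin indices as numbers for a Pearson correlation is only a quick sanity check, not a formal test. If the correlation comes out high, that is the cue to pick one scheme over the other, or to combine them, rather than feeding both into the model unchanged.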