Is this bias and will validation return useless results?

Posts: 32

I have built a 'probability of purchase' prediction model for a set of retail banking customers.

By and large, I have included our entire customer base, limited to 'active' customers that actually have business with us.

We are now about to do a live (small scale) test of the model by displaying banners in our online banking solution.

By doing this, I will effectively be testing the model on a subset of the customer base since I will limit the test implicitly to customers who use the online banking solution.

Since my model is built on customers irrespective of whether they have (or use) the internet banking solution, will this test and its results be valid?

That is, since there is bias in my test population (only customers who log on to our online banking solution), is this a flaw that will give invalid test results?

I'd appreciate any advice.



Super Contributor
Posts: 337

Re: Is this bias and will validation return useless results?

Posted in reply to f_rederik

Hey Fred,

If I understand correctly, you trained a model on your active population, but you will only test it on a subset of that population: your customers with online banking.

Although you should probably expect different results, you can still use your model to gain insight into your active customers and the incentives that drive their expected purchases.

Let's go through an example. Suppose you are building a model that predicts who in your active portfolio will transfer a balance from other credit cards using a promotion with a special low rate. Your model finds that, historically, the customers who take this promotion are men in their early 30s who pay at least 850 USD in interest per year on average. Instead of sending this promotion via snail mail like every year, this time you are only displaying a banner on the welcome page of the ATM screens when a customer withdraws cash.

Suppose also that customers in this segment are used to paying for everything with their credit cards, and rarely use cash or your ATMs.

Although you expect fewer customers to take the promotion because some of them will not see the banner, you can use your model to estimate the response of your customers in their 20s and 40s who do visit your ATMs.

It seems to me that you have all the information you need to evaluate how differently your model could perform on a subset of your population. For instance, you can compare how your assessment metrics change between your active population and the subset that regularly uses online banking. Assess differences in the distribution of the predicted events and non-events, the ROC curves, and the fit statistics for the population and your subset.
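
As a rough illustration of that comparison, here is a minimal sketch in Python rather than SAS. It computes the area under the ROC curve for the full scored population and for the online banking subset. The record layout, the `uses_online` flag, the function names and the toy numbers are all assumptions made up for the example, not anything from your project.

```python
def auc(scores, labels):
    """Area under the ROC curve: the share of (buyer, non-buyer) pairs in
    which the buyer has the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_segment(records):
    """records: iterable of (score, purchased, uses_online) tuples.
    Returns the AUC for everyone and for the online banking subset."""
    records = list(records)
    online = [r for r in records if r[2]]
    return {
        "all": auc([r[0] for r in records], [r[1] for r in records]),
        "online": auc([r[0] for r in online], [r[1] for r in online]),
    }


# Hypothetical scored customers: (predicted probability, purchased, uses_online)
toy = [
    (0.80, 1, True), (0.35, 1, True), (0.40, 0, True), (0.10, 0, True),
    (0.70, 1, False), (0.20, 0, False), (0.60, 0, False), (0.55, 1, False),
]
print(auc_by_segment(toy))
```

The pairwise loop is quadratic, so it is only meant for a sample of scored customers. A clear gap between the two numbers would suggest the model ranks the online banking customers less well (or better) than the population it was trained on.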

I hope it helps,


Posts: 32

Re: Is this bias and will validation return useless results?

Posted in reply to M_Maldonado

Thanks Miguel that is very useful feedback.

If I understand your input correctly, you are essentially saying that the original plan would still yield somewhat valid results.

I did train my model on a larger population of customers, and the plan was to perform the test on a subset of the population.

As it turned out, it was not too complicated to relax the filters in my various jobs, so I have now trained a new model on a subset of the population that has the same properties as the intended test population.

I will no doubt encounter similar scenarios, so I guess the question is: how can I evaluate whether systematic differences between the train and test populations will cause problems? In principle, I would probably want several purpose- or product-specific models that work irrespective of channel (i.e., online banking, branch, etc.).



Super Contributor
Posts: 337

Re: Is this bias and will validation return useless results?

Posted in reply to f_rederik

Hi Fred,

In a strict statistical sense, for the test to be fully valid, your test subset needs to be as close as possible to a random sample of the population the model was trained on. If you had to go with that model anyway, you can get a rough idea of what to expect by comparing the distribution of your model's predicted probabilities between your subset and your training population. If you are introducing bias, you can test whether moving the cutoff of your predicted probabilities helps.

Now that you have a new model, it sounds like you already have the answer to your original question. If the two models are noticeably different, using your original model on that subset would have been a poor choice.

In the future, comparing the predicted probabilities for your training population with those for your subset can give you a better sense of whether the subset is a good candidate for testing a model.
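
One way to put a number on that comparison is sketched below in Python. It uses the two-sample Kolmogorov-Smirnov statistic, which is my choice for the illustration and not the only option, and it also shows how the share of customers flagged as likely buyers moves with the cutoff. The function names and score lists are invented for the example.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical cumulative distributions of a and b
    (0 = identical distributions, 1 = no overlap at all)."""
    a, b = sorted(a), sorted(b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    gap = 0.0
    for v in sorted(set(a) | set(b)):
        # Advance each pointer past every score <= v to get the ECDF at v.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def share_flagged(scores, cutoff):
    """Fraction of customers whose predicted probability is >= cutoff."""
    return sum(s >= cutoff for s in scores) / len(scores)


# Hypothetical predicted probabilities (assumed data, for illustration only)
train_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45, 0.60, 0.80]
subset_scores = [0.20, 0.30, 0.35, 0.50, 0.65, 0.85]

print("KS statistic:", ks_statistic(train_scores, subset_scores))
for cutoff in (0.3, 0.4, 0.5):
    print(cutoff,
          share_flagged(train_scores, cutoff),
          share_flagged(subset_scores, cutoff))
```

A KS value near 0 means the subset is scored much like the training population. A large value means the score distributions differ, and the cutoff table shows how many customers each cutoff would select in each group.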

Good luck!

SAS Employee
Posts: 122

Re: Is this bias and will validation return useless results?

Posted in reply to f_rederik


You can test your model on a small random sample of the customers who use only the online banking solution to see if the model is applicable to that cohort.

As a practical matter, models that are deployed for the 'right' segments can go wrong for all sorts of reasons. A small-scale pre-test like this should help flush out potential complications before a full-scale 'live' deployment.

Models are supposed to be deployed for the intended segment. This is essential for measuring model performance. It does not mean the model has little or no applicability to segments that do not overlap much with the original model universe. If a model is driven by variables that are shared between the online banking and non-online banking customer bases, the model may very well hold up OK. What you can test when building the model is to create a flag or a set of variables that 'single out' the online banking customers, and see whether such a flag is predictively significant or will bias your model in any way. In some cases, such a flag is statistically significant but does not call for a separate model for each segment. Sometimes the flag may behave like a 'fault' that separates continents, in which case a separate model may be a good idea.

At the variable selection phase, pay attention to variables that are unique to one segment and not present in the rest.
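
Before refitting with a flag, a lighter check is to compare observed and predicted purchase rates within each segment. The Python sketch below is only a stand-in for the significance test described above, not a replacement for it, and the record layout and names are assumptions for the illustration. If the existing model already captures the difference between segments, the observed-to-predicted ratio should be close to 1 in both.

```python
def observed_vs_expected(records):
    """records: iterable of (score, purchased, uses_online) tuples.
    Returns, per segment (True = online banking, False = not), the count,
    the observed purchase rate, the mean predicted probability, and the
    ratio observed / predicted."""
    groups = {}
    for score, purchased, uses_online in records:
        groups.setdefault(bool(uses_online), []).append((score, purchased))
    summary = {}
    for segment, rows in groups.items():
        n = len(rows)
        observed = sum(y for _, y in rows) / n
        predicted = sum(s for s, _ in rows) / n
        summary[segment] = {
            "n": n,
            "observed": observed,
            "predicted": predicted,
            "ratio": observed / predicted if predicted > 0 else float("nan"),
        }
    return summary


# Hypothetical scored customers: (predicted probability, purchased, uses_online)
sample = [
    (0.20, 0, True), (0.40, 1, True),
    (0.10, 0, False), (0.30, 0, False),
]
for segment, stats in observed_vs_expected(sample).items():
    print("online" if segment else "not online", stats)
```

A ratio far from 1 in one segment hints that the flag carries information the model is missing, which is the situation where adding the flag, or building a separate model, is worth testing properly.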

Best Regards

Jason Xin
