Hello,
I am looking for a PROC recommendation to apply to the following problem.
Industry: bank > credit card;
Full Population: 800k;
(Outage) directly impacted population: 200k;
Question: of the 200k, identify the delinquent population (late payment) likely resulting from the Outage (as opposed to delinquent customers who would have been delinquent irrespective of the outage).
Description: During a few days in March, the monthly financial statement was delivered to customers blank due to a glitch. Of the customers who were sent a blank statement, I first want to isolate those who missed their payment on the due date (delinquent), and then estimate how many of them were likely delinquent because of the March statement issue, as opposed to those who would have been delinquent irrespective of the issue because they habitually pay late.
I can go back 12 months to get delinquency trending.
Any tips or suggested approaches are highly appreciated. Thanks in advance.
Before you can choose a SAS PROC to use, you have to create/explain a methodology for performing this analysis. Then, you can determine which PROC(s) to use.
So, suppose there were only 5 customers. What analysis would you do? What steps are there in the analysis? Can you write out with pencil and paper (or pen and paper, or a text editor, or PowerPoint) the exact logic and mathematical steps?
Here is a start:
(i) Of the 200k impacted, identify those that are delinquent in a given month (say March).
(ii) Then, for the last 12 months, capture the monthly volume of delinquencies and non-delinquencies.
Based on the trend, apply a suitable statistical test. Can someone suggest one that fits the scenario described?
I'm looking for an output that gives the statistical probability that the identified population became delinquent because of the missing statement rather than other factors. Apologies if this description is not helpful.
I'm reading between the lines here, but this is what I think you are trying to say (and please tell me if it is not):
Assume the month of interest is March.
Using the trend in delinquency percent over the last 12 months (March of the previous year through February of the current year), predict the delinquency percent for March of this year, then compare the predicted delinquency percent to the actual delinquency percent in March of the current year.
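As a rough pencil-and-paper version of that comparison, here is a minimal sketch in Python (not SAS) that fits a straight-line trend to 12 monthly delinquency percentages and projects one month ahead. The monthly rates and the actual March rate are made-up placeholders, and a straight-line trend is only one possible assumption about how to "predict" March; the names `linear_forecast`, `history`, and `actual_march` are hypothetical.

```python
# Hypothetical monthly delinquency rates (percent) for the impacted group,
# March of the previous year through February of the current year.
history = [4.1, 4.0, 4.2, 4.3, 4.1, 4.2, 4.4, 4.3, 4.5, 4.8, 4.6, 4.4]
actual_march = 6.0   # hypothetical observed rate (percent), not real data
impacted = 200_000   # impacted population, as given in the original post


def linear_forecast(values):
    """Fit y = a + b*t by ordinary least squares and predict the next point."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept + slope * n  # t = n is the month after the history ends


predicted = linear_forecast(history)
excess_pct = actual_march - predicted
excess_customers = round(impacted * excess_pct / 100)

print(f"predicted March rate: {predicted:.2f}%")
print(f"actual March rate:    {actual_march:.2f}%")
print(f"excess delinquent customers (aggregate estimate): {excess_customers}")
```

The result is an aggregate count of "extra" delinquencies over what the trend would suggest. It says nothing about which customers those are, and a straight line does not account for seasonal patterns.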
However, none of this is capable of identifying which individual loans are delinquent because of the outage and which are not, which is what I think you meant when you said "of the 200k, identify the delinquent population (late payment) likely resulting from the Outage".
OK, thanks for your feedback. Not sure if this makes a difference, but I'm less interested in "identifying which individual loans are delinquent because of the outage"; I want to de-emphasize 'because'. Apologies if I explained it poorly earlier.
If I may rephrase it: based on past payment behavior (say, a 12-month history), assign each customer a delinquency risk value.
Create different buckets/categories separated by risk value thresholds.
Finally, assign the customers who became delinquent in April (from the missing March statement) to the just-created categories, based on their delinquency risk value.
Ultimately, if we see a bigger ratio of low-risk customers delinquent in April, that could indicate that the March statement issue had an effect, or an interaction effect, in leading to their delinquency. This is to help a business leader determine whether to consider a full or partial refund for that particular month.
Of course, the above is a high-level concept. I'll have access to fresh data in two weeks. Until then I'll research further, and if I can work out something interesting, I will post it here. Thank you.
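To make the bucketing idea concrete, here is a minimal sketch in Python (not SAS) on five made-up customers. It assumes the simplest possible risk value, the share of the past 12 months in which a customer was late, and arbitrary placeholder thresholds of 0.10 and 0.30; the names `risk_score`, `bucket`, and `customers` are hypothetical, and a real risk model would likely use more inputs.

```python
from collections import Counter


def risk_score(flags):
    """Share of past months in which the customer was delinquent (0.0 to 1.0)."""
    return sum(flags) / len(flags)


def bucket(score, low=0.10, high=0.30):
    """Map a risk score to a bucket; the thresholds are arbitrary placeholders."""
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


# Hypothetical toy data: id -> (12 monthly flags where 1 = delinquent, April flag)
customers = {
    "A": ([0] * 12, 1),
    "B": ([0] * 11 + [1], 0),
    "C": ([1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0], 1),
    "D": ([1] * 6 + [0] * 6, 1),
    "E": ([0] * 12, 0),
}

totals, late = Counter(), Counter()
for flags, april_late in customers.values():
    b = bucket(risk_score(flags))
    totals[b] += 1          # customers in this bucket
    late[b] += april_late   # of those, how many were delinquent in April

for b in ("low", "medium", "high"):
    if totals[b]:
        rate = late[b] / totals[b]
        print(f"{b:<6} customers={totals[b]} april_late={late[b]} rate={rate:.2f}")
```

The April rate in each bucket would then be compared with that bucket's delinquency rate in an ordinary month; an unusually high April rate in the low-risk bucket is the signal described above.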
So now that I think I understand better, I believe there are a number of methods to assign risk of delinquency to each customer.
One that comes to mind is a discriminant analysis (PROC DISCRIM), where the 12 previous payments (delinquent or non-delinquent) are used to predict the probability (risk) of the next payment being delinquent or non-delinquent. I have some misgivings about this. In particular, it would ignore any seasonality that might be present, where the entire population might be more likely to be delinquent when the Christmas shopping bills arrive and less likely to be delinquent after the April tax refund checks arrive. But discriminant analysis might be a good place to start. Other credit factors might also be used. And if you have more than 12 months of data, I'd certainly use that.
Thanks for the lead and tip; will look into it!