I'm thinking of applying machine learning algorithms to a data set covering a portfolio of credit contracts. The data set is huge: about 22 million contracts with about 100 million rows of panel data from 2000 to 2015, and up to 30 individual characteristic variables.
With this data set I want to estimate the so-called "prepayment risk". Generally speaking, I have a regression problem in which I want to estimate parameters in order to forecast the probability that a loan is prepaid within a group of clients.
In the literature, such models are mostly estimated with logistic regression because the dependent variable is usually discretized. Some extended versions have also been modelled with survival analysis.
The goal of my research is to find out whether, and how, neural networks can improve the estimation compared to logistic regression. Hence I want to go a step further and estimate my parameters with a deep learning ANN. Unfortunately I don't have any experience with machine learning, but I'm a graduate student of econometrics.
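To make the logistic baseline concrete, here is a minimal sketch in plain Python of what I have in mind. Everything in it is an assumption for illustration only: the data are simulated, and the two covariates (`incentive`, `age_decades`), their coefficients, and the function names are hypothetical and have nothing to do with my real portfolio.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_panel(n_rows, seed=0):
    """Simulate loan-period rows: ((incentive, age_decades), prepaid_flag).

    Both covariates and the coefficients below are made up for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        incentive = rng.uniform(-2.0, 2.0)   # hypothetical rate incentive
        age_decades = rng.uniform(0.0, 1.0)  # hypothetical loan age, in decades
        p = sigmoid(-3.0 + 1.2 * incentive + 1.0 * age_decades)
        rows.append(((incentive, age_decades), 1 if rng.random() < p else 0))
    return rows


def predict(w, x):
    """Prepayment probability for one row; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def log_likelihood(w, rows):
    """Mean Bernoulli log-likelihood of the rows under weights w."""
    total = 0.0
    for x, y in rows:
        p = min(max(predict(w, x), 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(rows)


def fit_logit(rows, lr=1.0, epochs=300):
    """Fit a logistic regression by full-batch gradient ascent."""
    k = len(rows[0][0])
    w = [0.0] * (k + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in rows:
            err = y - predict(w, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w


rows = simulate_panel(1000)
weights = fit_logit(rows)
print("estimated weights:", weights)
print("mean log-likelihood:", log_likelihood(weights, rows))
```

The idea would then be to replace the linear index inside `predict` with a neural network and compare both models on the same criterion, ideally the log-likelihood on held-out contracts rather than on the training rows.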
Now I'd like to hear some opinions from some "experts". Do you think this is a good idea for a master's thesis? I'm afraid that with the neural network approach I might get stuck at some point.
Could you give me some advice on good papers and books? Maybe some online lectures that are worth watching?
Can these things be done in SAS?