06-01-2017 05:50 PM

I'm thinking of applying machine learning algorithms to a data set of a portfolio of credit contracts. The data set is huge: around 22 million contracts with around 100 million data rows from 2000 to 2015 (panel data) and up to 30 individual characteristic variables.

With this data set I want to estimate the so-called "prepayment risk". Generally speaking, I have a regression problem where I want to estimate the parameters in order to forecast the probability of a loan being prepaid within a group of clients.

In the literature such models are basically estimated with a logistic regression because the dependent variable is usually discretized. Some extensions have also been modelled with survival analysis.
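To make the logistic-regression baseline concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data are synthetic, the two features (a standardized "rate incentive" and a standardized "loan age") are hypothetical stand-ins for the real contract characteristics, and the simple gradient-descent fit is not how a statistics package would estimate the model. It only shows the shape of the problem: a binary "prepaid in this period" indicator regressed on covariates.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression by batch gradient descent on the mean log-loss.

    Returns the weight list; the last entry is the intercept.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[d])
            err = p - yi  # derivative of the log-loss w.r.t. the linear predictor
            for j in range(d):
                grad[j] += err * xi[j]
            grad[d] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, x):
    # Estimated prepayment probability for one observation.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


# Synthetic data (assumption): prepayment becomes more likely with a higher
# rate incentive and less likely with higher loan age.
random.seed(0)
TRUE_W = [1.5, -1.0]
TRUE_B = -2.0
X, y = [], []
for _ in range(2000):
    x = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    p = sigmoid(TRUE_W[0] * x[0] + TRUE_W[1] * x[1] + TRUE_B)
    X.append(x)
    y.append(1 if random.random() < p else 0)

w = fit_logistic(X, y)
print("estimated weights (incentive, age, intercept):", w)
print("P(prepay | incentive=1, age=0):", predict_proba(w, [1.0, 0.0]))
```

A neural network would replace the single linear predictor inside `sigmoid` with one or more hidden layers; the binary target and the log-loss stay the same, which is what makes the two approaches directly comparable.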

The goal of my research is to find out whether, and how, neural networks can improve the estimation compared to a logistic regression. Hence I want to go a step further and estimate my parameters with a deep learning ANN. Unfortunately I don't have any experience with machine learning, but I'm a graduate student of econometrics.

Now I would like to hear some opinions from some "experts". Do you think this is a good idea for a master's thesis? I'm afraid that with the neural network approach I might get stuck at some point.

Could you give me some advice about good papers and books? Maybe some online lectures that are worth watching?

Can these things be done in SAS?

06-02-2017 10:50 AM

I'm *not* an expert on doing it in SAS, but I don't believe the neural nets in SAS are "deep." You may want to investigate Keras using Theano or TensorFlow (implemented in Python most likely) in order to do something with deep learning. The Keras website (https://keras.io/) is extremely detailed with tons of documentation... basically the "book" you want, just delivered as a website. As a SAS user, I would find your research fascinating if you got it to work as part of an overall SAS process. :-)