Hi Experts,
I have a question about data smoothing in SAS. Let's say I have 4 years of data for 10 million customers.
Sample data looks like this:
Customer_ID | Year | values |
2 | 2014 | 40 |
2 | 2015 | 24 |
2 | 2016 | 14 |
2 | 2017 | 26 |
4 | 2014 | 1 |
4 | 2015 | 18 |
4 | 2016 | 9 |
4 | 2017 | 8 |
5 | 2014 | 30 |
5 | 2015 | 20 |
5 | 2016 | 10 |
5 | 2017 | 12 |
So from the above table, I have to create a risk tag for each customer based on their four years of scores. However, a customer may get a good score in some years and a bad score in 2 years.
So I need to classify each customer using a smoothing technique in SAS.
Please, can I get a code reference for this?
Thanks in advance.
Regards,
Anil
Four years is not very much data for smoothing. I think the best technique would be a simple one such as an average or a weighted average. If you use a weighted average, you probably want to assign more weight to the most recent data, such as
w = {0.125, 0.25, 0.5, 1};
or standardize the weights so that they sum to 1 by defining
w = w / sum(w);
If you use the standardized weights to develop a score for Customer_ID = 2, the score for that customer would be
Score = (0.125*40 + 0.25*24 + 0.5*14 + 1*26) / 1.875 = 44 / 1.875 = 23.47
For a reference, you can read about various kinds of moving averages and how to compute a moving average in SAS. But if you know that you have exactly four observations for each ID, you can use BY-group processing and the FIRST.ID and LAST.ID variables (FIRST.Customer_ID and LAST.Customer_ID for the sample data) to compute the score and add it to the data set.
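As a rough sketch of the BY-group approach, the DATA step below computes the standardized weighted average for each customer. It assumes exactly four observations per Customer_ID, sorted by Year. The data set names HAVE and SCORES and the variable names Value, Score, n, wsum, and wtot are placeholders chosen for illustration, not names from the original post.

```sas
/* Sample data from the question. HAVE is a placeholder data set name. */
data have;
   input Customer_ID Year Value;
   datalines;
2 2014 40
2 2015 24
2 2016 14
2 2017 26
4 2014 1
4 2015 18
4 2016 9
4 2017 8
5 2014 30
5 2015 20
5 2016 10
5 2017 12
;

/* BY-group processing requires the data to be sorted by the BY variable. */
proc sort data=have;
   by Customer_ID Year;
run;

data scores(keep=Customer_ID Score);
   set have;
   by Customer_ID;
   /* Weights from oldest to newest year. Assumes at most 4 rows per customer. */
   array w[4] _temporary_ (0.125 0.25 0.5 1);
   if first.Customer_ID then do;
      n    = 0;   /* position of the current year within the customer */
      wsum = 0;   /* running weighted sum of values */
      wtot = 0;   /* running sum of the weights used */
   end;
   /* Sum statements retain their variables across observations. */
   n + 1;
   wsum + w[n] * Value;
   wtot + w[n];
   if last.Customer_ID then do;
      Score = wsum / wtot;   /* standardized weighted average */
      output;                /* one row per customer */
   end;
run;
```

For Customer_ID = 2 this gives 44 / 1.875, or about 23.47, matching the hand calculation. You would still need to choose cutoff values on Score to assign the risk tag. With 10 million customers the sort is the expensive step, so it can be skipped if the data are already ordered by Customer_ID and Year.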