Hi Experts,
I have a question about data smoothing in SAS. Let's say I have 4 years of data for 10 million customers.
The sample data looks like this:
Customer_ID | Year | values |
2 | 2014 | 40 |
2 | 2015 | 24 |
2 | 2016 | 14 |
2 | 2017 | 26 |
4 | 2014 | 1 |
4 | 2015 | 18 |
4 | 2016 | 9 |
4 | 2017 | 8 |
5 | 2014 | 30 |
5 | 2015 | 20 |
5 | 2016 | 10 |
5 | 2017 | 12 |
From the above table, I have to create a risk tag for each customer based on their four years of scores. However, a customer may get a good score in some years and a bad score in others (for example, a bad score in 2 of the 4 years).
So I need to classify customers by using a smoothing technique in SAS.
Could I please get a code reference for this?
Thanks in advance.
Regards,
Anil
Four years is not very much data for smoothing. I think the best technique would be a simple one such as an average or a weighted average. If you use a weighted average, you probably want to assign more weight to the most recent data, such as
w = {0.125, 0.25, 0.5, 1};
or standardize the weights so that they sum to 1 by defining
w = w / sum(w);
If you use the standardized weights to develop a score for Customer_ID = 2, the score for that customer would be
Score = (0.125*40 + 0.25*24 + 0.5*14 + 1*26) / 1.875 = 44 / 1.875 = 23.47
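To check the arithmetic, here is a minimal SAS/IML sketch. It assumes you have SAS/IML licensed; the vector name x is only illustrative and holds the four values for Customer_ID = 2 from the sample data.

```sas
proc iml;
w = {0.125, 0.25, 0.5, 1};   /* weights for 2014, 2015, 2016, 2017 */
w = w / sum(w);              /* standardize so the weights sum to 1 */
x = {40, 24, 14, 26};        /* values for Customer_ID = 2 (illustrative name) */
Score = sum(w # x);          /* elementwise product, then sum = weighted average */
print Score;                 /* approximately 23.47 */
quit;
```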
For a reference, you can read about various kinds of moving averages and how to compute a moving average in SAS. But if you know that you have exactly four observations for each ID, you can use BY-group processing in the DATA step and the FIRST.ID and LAST.ID variables to compute the score and add it to the data set.
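The BY-group approach might look something like the following sketch. It assumes exactly four observations per customer, sorted by year. The data set names Have and Scores and the variable names Value, Score, and k are illustrative choices, not names from the original question.

```sas
/* Sample data from the question; the data set name Have is an assumption */
data Have;
   input Customer_ID Year Value;
   datalines;
2 2014 40
2 2015 24
2 2016 14
2 2017 26
4 2014 1
4 2015 18
4 2016 9
4 2017 8
5 2014 30
5 2015 20
5 2016 10
5 2017 12
;

/* BY-group processing requires sorted data; skip this step if already sorted */
proc sort data=Have;
   by Customer_ID Year;
run;

data Scores(keep=Customer_ID Score);
   set Have;
   by Customer_ID;
   /* weights for the 1st..4th year of each customer; most recent year weighs most */
   array w[4] _temporary_ (0.125 0.25 0.5 1);
   if first.Customer_ID then do;
      k = 0;                      /* position within the BY group */
      Score = 0;                  /* running weighted sum */
   end;
   k + 1;                         /* sum statement: k is retained across rows */
   Score + w[k] * Value;          /* accumulate weight * value */
   if last.Customer_ID then do;
      Score = Score / 1.875;      /* divide by the sum of the weights */
      output;                     /* one row per customer */
   end;
run;

proc print data=Scores noobs;
run;
```

For Customer_ID = 2 this gives the same score of about 23.47 as the hand calculation. A risk tag could then be assigned by comparing Score to cutoff values, which you would need to choose because none are given in the question.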