Hello Data Mining Experts. I need your advice here 🙂
I have never built a predictive model before, but I have read a lot.
I have a simple data set covering 28 months.
Each month has 2 variables, x and y, denoting the number of clicks (either 0 or 1) that happened on a daily basis.
I create another variable, z = sum(x)/sum(y).
So, having these 28 tables and the value of z in each of them, I need to forecast the value of z for the year 2019.
What I need to understand here is:
1. Will the z values from all these tables be enough to predict its value in 2019 (model z = sum(x)/sum(y))?
2. Do I need to predict sum(x) and sum(y) first, if that is possible, before forecasting z?
3. What would be the right model to accomplish 1, or 1 and 2?
This might be a silly question, but I would take any advice :). Thanks
January:
x y
------
0 1
1 1
0 0
....
....
....
The only way to predict z for 2019 is to have predictions for sum(x) in 2019 and sum(y) in 2019. This requires a time series model, which could be seasonal, could be auto-regressive, or some other type of time series to predict the next year.
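To spell the reasoning above out as a formula: this is only a sketch, and it assumes z for 2019 is defined the same way as for each month. The symbols \hat{X}_m and \hat{Y}_m are introduced here for illustration and stand for the forecast values of sum(x) and sum(y) in month m of 2019.

```latex
\hat{z}_{2019} = \frac{\sum_{m=1}^{12} \hat{X}_m}{\sum_{m=1}^{12} \hat{Y}_m}
```

In general this is not the same as the average of the twelve monthly ratios \hat{X}_m / \hat{Y}_m, because months with a larger sum(y) carry more weight in the ratio of sums.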
There's a lot of information that needs to be considered when you select a modeling technique, before you can get any reasonable answer, and if I were you, I'd find a statistician at your company or university, and consult with him or her.
Some of the many things you would need to consider (not a complete list)
Thanks for the quick response, PaigeMiller.
1. I don't know what you mean by seasonal.
2. It will always be linearly dependent on sum(x)/sum(y).
3. I did not get that. I mean z = sum(x)/sum(y), so it depends on both variables.
No statisticians in my team or nearby 😞
Thank you Paige Miller,
That is a good start. I will look for some info on time series models.
Hi Paige, so here is my actual data. This looks like a trend, right? A linear trend?
And I can use the code below to predict values for 12 months:
proc forecast data=tt interval=month lead=12 out=yyy;
   id date;
   var sum_x sum_y;
run;
I also looked it up and saw proc forecast used with the Holt, stepar, and ses methods...
Would you recommend some of those?
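For reference, here is a hedged sketch of the same PROC FORECAST call with the method written out explicitly, followed by a step that turns the forecasts into z. It assumes TT has one row per month, that DATE holds a real SAS date value (not the number 201511), and that TT is sorted by DATE. The output data set name z_forecast is made up for illustration. METHOD=STEPAR with TREND=2 is, to my knowledge, what the procedure uses when no method is given, so this should behave like the call above.

```sas
/* Sketch only: assumes TT is sorted by DATE and DATE is a SAS date value. */
proc forecast data=tt interval=month lead=12
              method=stepar trend=2   /* stepwise autoregressive, linear trend */
              out=yyy outfull;        /* OUTFULL also writes actuals and 95% limits */
    id date;
    var sum_x sum_y;
run;

/* Keep only the forecast rows and form the ratio for each future month. */
/* z_forecast is a hypothetical name chosen for this example.            */
data z_forecast;
    set yyy;
    where _type_ = 'FORECAST';
    z = sum_x / sum_y;
run;
```

PROC FORECAST also offers METHOD=EXPO (exponential smoothing) and METHOD=WINTERS with SEASONS=MONTH for seasonal data. With only 28 months there are barely two seasonal cycles to estimate from, so any seasonal fit should be treated with caution.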
date sum_x sum_y
----------------------------
201511 3088 7828
201512 2312 7260
201601 2415 5331
201602 2498 5411
201603 3001 6470
201604 3383 7333
201605 3709 8078
201606 4670 10331
201607 4070 9153
201608 4092 9194
201609 3780 8299
201610 3239 6871
201611 3270 7044
201612 2528 5295
201701 1831 4062
201702 1810 3832
201704 2119 4845
201705 6156 13772
201706 3784 8590
201707 3149 7437
201708 3188 7587
201709 2801 6726
201710 2702 6234
201711 2548 5746
201712 2382 5506
201801 1482 3455
201802 1412 3475
201803 1644 4031
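A minimal sketch of how the table above could be loaded so that PROC FORECAST sees a proper monthly date, with z computed per month. The variable name yyyymm is an assumption made for illustration, and the rows are copied exactly as posted. Note that 201703 does not appear in the posted data; how the procedure treats that gap is worth checking in the log and the documentation.

```sas
/* Read the posted values and convert the yyyymm number to a SAS date. */
data tt;
    input yyyymm sum_x sum_y;
    /* yyyymm is a hypothetical helper variable; YYMMN6. reads 201511 as 01NOV2015 */
    date = input(put(yyyymm, 6.), yymmn6.);
    format date yymmn6.;
    z = sum_x / sum_y;          /* monthly ratio as defined in the question */
    drop yyyymm;
    datalines;
201511 3088 7828
201512 2312 7260
201601 2415 5331
201602 2498 5411
201603 3001 6470
201604 3383 7333
201605 3709 8078
201606 4670 10331
201607 4070 9153
201608 4092 9194
201609 3780 8299
201610 3239 6871
201611 3270 7044
201612 2528 5295
201701 1831 4062
201702 1810 3832
201704 2119 4845
201705 6156 13772
201706 3784 8590
201707 3149 7437
201708 3188 7587
201709 2801 6726
201710 2702 6234
201711 2548 5746
201712 2382 5506
201801 1482 3455
201802 1412 3475
201803 1644 4031
;
run;

/* List the monthly ratio to see how stable z is over time. */
proc print data=tt;
    var date sum_x sum_y z;
    format z 6.3;
run;
```

Looking at z alongside sum_x and sum_y helps with questions 1 and 2 above: if z is nearly constant while both sums move together, forecasting the two sums and taking their ratio may give much the same answer as forecasting z directly.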