02-02-2015 06:21 PM
I’m writing this first post because I’m going crazy analyzing a rather complicated data set.
I’ve been dealing with data mining problems for several years, but mostly focused on classification and association. In this case I have to develop a multivariate regression model to explain how investments in marketing, communication and HR training affect the commercial output (e.g. sales) of different kinds of products. The purpose of the model is explanatory, of course, but also predictive, as the final result must be a sort of simulation dashboard (useful for planning investments in future campaigns).
Structure of the data: longitudinal, time-series cross-sectional, i.e. weekly observations (3 years long) for 5 different product categories, so I have about 750 observations. Dependent variable: product category sales. Independent variables: about 200 potential predictors.
Ok, nothing particularly strange.. but..
Ok, this is the framework. I tried a lot of models (ARIMAX, GLM, pooled regression...), but nothing worked well: poor fit and poor interpretability. I need some good tips. Any ideas?
Thanks for your help!
02-08-2015 06:23 PM
If you plan to measure the effectiveness of marketing campaigns in the future, I would highly recommend investing resources in design of experiments so that you can use a control group and incremental response models. Take a look at this thread ( ) for a paper and a video. You can find more information in the Enterprise Miner Reference Help (press F1 in Enterprise Miner). This kind of model will give you a clearer picture of which factors impact your sales.
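To make the control-group idea concrete, here is a minimal sketch in Python (not SAS, and not what Enterprise Miner's incremental response node does internally). It only shows the basic comparison an experiment makes possible: average outcome with the campaign minus average outcome without it. The sales figures and the function name `incremental_lift` are made up for illustration.

```python
from statistics import mean

# Hypothetical weekly sales for comparable units (made-up numbers).
treated_sales = [120.0, 135.0, 128.0, 142.0]  # exposed to the campaign
control_sales = [110.0, 118.0, 115.0, 121.0]  # held out as the control group


def incremental_lift(treated, control):
    """Average treated outcome minus average control outcome."""
    return mean(treated) - mean(control)


lift = incremental_lift(treated_sales, control_sales)
print(f"Estimated incremental sales per unit: {lift:.2f}")
```

Without a control group, the same difference cannot be separated from seasonality or other factors that moved sales in the same weeks.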
In the meantime I can offer these suggestions.
Time Series Analysis
It seems to me that you should have an ARIMA model for each product. You might also want to try using just the last year of data (perhaps the 52 weeks that best capture seasonal effects like Easter, Christmas, and other important holidays in your data?). If you think that the null values are tripping up your model, turn those variables into accumulators (for example, dollars invested to date instead of dollars invested in a given period).
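A minimal sketch of the accumulator idea, written in Python rather than SAS. It assumes that a missing value means "nothing was invested that week", which may not be true for every predictor; the spend figures and the helper name `to_accumulator` are hypothetical.

```python
from itertools import accumulate

# Hypothetical weekly marketing spend; None marks weeks with no recorded value.
weekly_spend = [1000.0, None, None, 2500.0, None, 500.0]


def to_accumulator(values):
    """Treat missing weeks as zero spend (an assumption) and return the running total."""
    filled = [0.0 if v is None else v for v in values]
    return list(accumulate(filled))


spend_to_date = to_accumulator(weekly_spend)
print(spend_to_date)  # [1000.0, 1000.0, 1000.0, 3500.0, 3500.0, 4000.0]
```

With one model per product, the running total would be computed separately within each product's series, and possibly reset at the start of each campaign or year.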
I wouldn't think that you have to standardize your predictors (indexes, money counts, etc.). But give it a try with and without proc stdize before modeling and see how it goes.
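For reference, this is what a plain z-score standardization does to one predictor, sketched in Python rather than with proc stdize. It assumes the mean/standard-deviation method is the one wanted; the spend values and the function name `zscore` are made up for illustration.

```python
from statistics import mean, stdev


def zscore(values):
    """Center on the mean and scale by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        # A constant predictor carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical weekly ad spend for one product category.
ad_spend = [200.0, 400.0, 600.0, 800.0]
standardized = zscore(ad_spend)
print([round(z, 3) for z in standardized])
```

Standardizing changes the scale of the coefficients, not the fit of an ordinary regression, so the with/without comparison mostly matters for variable selection and for comparing effect sizes across predictors.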
I am not a time series expert, but I would think that model selection would take care of autocorrelation. I found this doc that might come in handy for selection (look for automatic model selection in the SAS/ETS(R) 9.2 User's Guide).
If your audience is more familiar with regressions than with ETS analysis, try aggregating product sales by product category and training quarterly or half-yearly models. I would hope that your 200 predictors, summarized over Q1 and Q2, are good predictors of Q3 sales. Use variable selection so that you end up with only a handful of the most predictive variables.
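A minimal sketch of the aggregation step, in Python rather than SAS. It assumes weeks are numbered 1 to 52 within a year and that each quarter is 13 weeks long; the rows and the function names `quarter_of` and `quarterly_totals` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (product_category, week_number, sales).
weekly_rows = [
    ("A", 1, 10.0), ("A", 2, 12.0), ("A", 14, 20.0),
    ("B", 1, 5.0), ("B", 27, 7.0), ("B", 28, 8.0),
]


def quarter_of(week):
    """Map a week number to quarter 1..4, assuming 13-week quarters."""
    return min((week - 1) // 13 + 1, 4)  # a week 53, if present, falls into Q4


def quarterly_totals(rows):
    """Sum sales per (category, quarter)."""
    totals = defaultdict(float)
    for category, week, sales in rows:
        totals[(category, quarter_of(week))] += sales
    return dict(totals)


totals = quarterly_totals(weekly_rows)
for key in sorted(totals):
    print(key, totals[key])
```

The same grouping can be applied to each predictor (sums for spend, averages for indexes), giving one row per category and quarter to feed into the regression.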
Personally I would prefer this second approach, as I find it easier to explain. Since you want to both explain and predict, you might have to take the best of each model. For example, you may want to use your regression model to find the most important factors that drive your product sales, and then use your time series model to actually predict future sales.
It sounds like you are on the right track. Good luck with your models!
02-13-2015 10:19 AM
Thanks Miguel, very useful suggestions!
The accumulator tip could be very interesting. For now we are proceeding with ARIMAX models, with good results. We standardized the dependent variable (creating a more homogeneous index variable) and it seems to have worked. Let's see what happens after the last fine tuning.
Keep in mind that we do not have information on individual customers, so it would be difficult to use uplift models in this case.
02-13-2015 10:24 AM
Great to hear that Milo!
Just curious, what are you using to standardize? Just proc stdize? If you are using something else, can I borrow an example from you? Someone asked about this the other day...
Bummer that you don't have info per customer...
Good luck with your ARIMAX!