Hello,
I have been tasked with performing predictive analytics for a couple of data sets and was wondering if someone could double-check my work.
I am trying to find the relation between several variables and Net_Sales.
My code:
proc corr data=MIS543.CLOTHINGORDERS pearson plots(maxpoints=none)=matrix;
var net_sale;
with quantity product_id territory_id unit_price;
run;
My results:
Based on these results:
Quantity, territory_id, and unit_price all have a positive correlation with net sales (meaning that as they go up, so do net sales), correct?
How is this predictive analytics? I would perform forecasting for net sales, but I have no time series data at my disposal.
Thanks
Given the way I use the word "predictive", PROC CORR won't do "predictive" analytics.
PROC GLM (or any other appropriate modeling procedure) will do predictive analysis, so you can actually predict an outcome. For example: if quantity=24 and unit_price=83, what is the predicted value of net_sale?
In any modeling, and in PROC CORR, you cannot use territory_id and product_id as continuous variables. They must be used as categories (CLASS variables) in PROC GLM.
So when it comes to using PROC GLM, how would I code this for quantity and unit price to predict net sales?
PROC GLM DATA=SALESDATA;
CLASS ?
MODEL ?
RUN;
I am not familiar with proc glm...
General answer:
PROC GLM DATA=SALESDATA;
CLASS <class variables go here>;
MODEL net_sale = <continuous variables go here> <class variables go here>;
RUN;
There are plenty of options in PROC GLM, as well as the ability to add interactions and polynomial terms. Take a look at the examples at https://documentation.sas.com/doc/en/pgmmvacdc/9.4/statug/statug_glm_examples.htm.
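As a rough illustration of the template above, here is a minimal sketch that uses the variable names mentioned earlier in this thread. It assumes territory_id and product_id are the categorical predictors and that quantity and unit_price are continuous. The SOLUTION option on the MODEL statement asks PROC GLM to print the parameter estimates. Whether this is the right model for your data is a separate question.

```sas
/* Sketch only: variable roles are assumed from the discussion above */
proc glm data=mis543.clothingorders;
   /* categorical predictors are listed in CLASS */
   class territory_id product_id;
   /* response on the left; continuous and CLASS predictors on the right */
   model net_sale = quantity unit_price territory_id product_id / solution;
run;
quit;  /* PROC GLM is interactive, so QUIT ends it */
```

An interaction could be requested by adding a term such as quantity*unit_price to the right-hand side of the MODEL statement.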
Thank you for the assistance,
So my code is this:
PROC GLM DATA=mis543.clothingorders;
CLASS NET_SALE;
MODEL NET_SALE= QUANTITY UNIT_PRICE;
RUN;
Which gives me these results and corresponding contour fit plot.
How do I interpret these results?
Net_sale is not a predictor variable, so it does not belong in the CLASS statement. Try taking the CLASS statement out and running it again.
OK, I thought I had to have a CLASS statement.
How does this look?
PROC GLM DATA=mis543.clothingorders;
MODEL NET_SALE= QUANTITY UNIT_PRICE;
RUN;
The results are similar, but I still don't really know how to interpret these results.
Thank you
The predictive equation for net_sale is
-428.2695 + 19.6373*quantity + 20.3367*unit_price
The R-square of >0.8 indicates that the model fits very well.
The Pr>F is less than 0.05 (a "standard" cutoff for this value) indicating that the effects of both quantity and unit_price are statistically significant.
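To make the predictive equation concrete, the following DATA step sketch plugs in the example values raised earlier in the thread (quantity=24 and unit_price=83). The coefficients are copied from the equation above; the variable name predicted is just an illustrative choice.

```sas
/* Sketch: evaluate the fitted equation by hand for one scenario */
data _null_;
   quantity   = 24;
   unit_price = 83;
   /* coefficients taken from the PROC GLM parameter estimates quoted above */
   predicted = -428.2695 + 19.6373*quantity + 20.3367*unit_price;
   put predicted=;  /* writes the value to the SAS log */
run;
```

Working the arithmetic by hand gives about 1730.97, which can be compared with the simple product 24*83 = 1992.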
I'm wondering why this is an example for statistics, anyway. Isn't net_sale always equal to quantity*unit_price? Maybe taxes are included in net_sale, so that net_sale is (1 + tax rate)*(quantity*unit_price)?
In my use of the word "predictive", no time variable is needed (although you can certainly do predictive modeling if you do have a time variable). Suppose you are baking chocolate chip cookies: the predictors are temperature and amount of shortening, and the response is some measure of taste. You can statistically determine a relationship that uses temperature and amount of shortening to predict the taste.
Your sales example seems weak for the reasons I mentioned. The relationship between sales, price per unit, and quantity is not statistical; it is deterministic.
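One way to check whether the relationship really is deterministic is to compare net_sale with quantity*unit_price row by row. The sketch below assumes the data set and variable names used earlier in this thread; the data set name check and the variables expected and diff are hypothetical names introduced only for illustration.

```sas
/* Sketch: is net_sale simply quantity*unit_price? */
data check;
   set mis543.clothingorders;
   expected = quantity * unit_price;   /* what a purely deterministic rule would give */
   diff     = net_sale - expected;     /* zero everywhere if the rule holds exactly */
run;

/* Summarize the differences: min = max = 0 would indicate an exact match */
proc means data=check n min max mean;
   var diff;
run;
```

If diff is not zero but the ratio net_sale/expected is constant, that would be consistent with something like a fixed tax rate or discount being applied.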