mccusker1818
Fluorite | Level 6

Hello,

 

I have been tasked with performing predictive analytics for a couple of data sets and was wondering if someone could double-check my work.

 

I am trying to find the relation between several variables and Net_Sales.

 

My code:

proc corr data=MIS543.CLOTHINGORDERS pearson plots(maxpoints=none)=matrix;
var net_sale;
with quantity product_id territory_id unit_price;
run;

My results:

net sales corr.PNG

 Based on these results:

Quantity, territory_id, and unit_price all have positive correlation with net sales (meaning that as they go up, so do net sales), correct?

 

How is this predictive analytics? I would perform forecasting for net sales, but I have no time series data at my disposal.

 

Thanks

9 REPLIES
PaigeMiller
Diamond | Level 26

Given the way I use the word "predictive", PROC CORR won't do "predictive" analytics.

 

PROC GLM (or any other appropriate modeling procedure) will do predictive analysis, so you can actually predict what would happen in a given case. For example: if quantity=24 and unit_price=83, what is the predicted value of net_sale?
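
One common way to get such a prediction is to append a row with the predictor values and a missing response, then ask PROC GLM for predicted values. The following is only a sketch: it assumes the MIS543.CLOTHINGORDERS data set and variable names from the original post, and the data set names to_score, combined, and scored, as well as the variable pred_net_sale, are made up for illustration.

```sas
/* Hypothetical scoring row: predictors filled in, response missing */
data to_score;
    quantity   = 24;
    unit_price = 83;
    net_sale   = .;
run;

/* Stack the scoring row under the original data */
data combined;
    set MIS543.CLOTHINGORDERS to_score;
run;

/* Rows with a missing response are not used to fit the model,
   but they still receive a predicted value in the OUT= data set */
proc glm data=combined;
    model net_sale = quantity unit_price;
    output out=scored predicted=pred_net_sale;
run;
quit;

/* Show the prediction for rows that had no observed net_sale */
proc print data=scored;
    where net_sale is missing;
    var quantity unit_price pred_net_sale;
run;
```

If the original data already contains rows with a missing net_sale, those rows will appear in the printout as well.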

 

In any modeling, and in PROC CORR, you cannot use territory_id and product_id as continuous variables. These must be used as categories (CLASS variables) in PROC GLM.
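
As a rough sketch of what that could look like, assuming the data set and variable names from the original post (whether these ID variables belong in the model at all is a separate question):

```sas
/* ID columns are listed in CLASS so they are treated as categories,
   not as numbers with a meaningful order or spacing */
proc glm data=MIS543.CLOTHINGORDERS;
    class territory_id product_id;
    /* SOLUTION prints the parameter estimates, including one
       estimate per level of each CLASS variable */
    model net_sale = quantity unit_price territory_id product_id / solution;
run;
quit;
```

With many distinct product_id values this produces a long table of estimates, so it may be worth starting with only one CLASS variable.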

--
Paige Miller
mccusker1818
Fluorite | Level 6

So when it comes to using PROC GLM, how would I code this for quantity and unit price to predict net sales?

 

PROC GLM DATA=SALESDATA;

CLASS ?

MODEL ?

RUN;

 

I am not familiar with proc glm...

PaigeMiller
Diamond | Level 26

General answer:

 

PROC GLM DATA=SALESDATA;

CLASS <class variables go here>;

MODEL net_sale = <continuous variables go here> <class variables go here>;

RUN;

 

There are plenty of options in PROC GLM, as well as the ability to add interactions and polynomial terms. Take a look at the examples at https://documentation.sas.com/doc/en/pgmmvacdc/9.4/statug/statug_glm_examples.htm.
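
For illustration only, here is a minimal sketch of the interaction and polynomial syntax, assuming the continuous variables quantity and unit_price and the data set name from the original post. It is not a recommendation that these terms are needed for this data.

```sas
proc glm data=MIS543.CLOTHINGORDERS;
    /* quantity*unit_price is an interaction (product) term;
       quantity*quantity is a quadratic (polynomial) term */
    model net_sale = quantity unit_price
                     quantity*unit_price
                     quantity*quantity;
run;
quit;
```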

--
Paige Miller
mccusker1818
Fluorite | Level 6

Thank you for the assistance,

 

So my code is this:

PROC GLM DATA=mis543.clothingorders;
CLASS NET_SALE;
MODEL NET_SALE= QUANTITY UNIT_PRICE;
RUN;

Which gives me these results and corresponding contour fit plot.

cccccc.PNG

ddddddd.PNG

 How do I interpret these results?

 

PaigeMiller
Diamond | Level 26

Net_sale is the response, not a predictor variable, so it does not belong in the CLASS statement. Try taking the CLASS statement out and running it again.

--
Paige Miller
mccusker1818
Fluorite | Level 6

OK, I thought I had to have a CLASS statement.

How does this look?

PROC GLM DATA=mis543.clothingorders;
MODEL NET_SALE= QUANTITY UNIT_PRICE;
RUN;

The results are similar, but I still don't really know how to interpret these results.

cccccc.PNG

ddddddd.PNG

 Thank you

PaigeMiller
Diamond | Level 26

The predictive equation for net_sale is

 

-428.2695 + 19.6373*quantity + 20.3367*unit_price
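
To see the equation in use, here is a small sketch that plugs in example values (quantity=24 and unit_price=83). The coefficients are the ones shown above; the variable name predicted is made up for illustration.

```sas
/* Evaluate the fitted equation for one hypothetical order */
data _null_;
    quantity   = 24;
    unit_price = 83;
    predicted  = -428.2695 + 19.6373*quantity + 20.3367*unit_price;
    put predicted=;   /* writes the value to the SAS log */
run;
```

By hand, this is -428.2695 + 471.2952 + 1687.9461 = 1730.9718.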

 

The R-square of >0.8 indicates that the model fits very well: it explains more than 80% of the variation in net_sale.

 

The Pr>F is less than 0.05 (a "standard" cutoff for this value) indicating that the effects of both quantity and unit_price are statistically significant.

 

I'm wondering why this is an example for statistics, anyway. Isn't net_sale always equal to quantity*unit_price? Maybe taxes are included in net_sale, so that net_sale is (1 + tax rate)*(quantity*unit_price)?
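
One way to check this directly would be something like the sketch below. It assumes the data set and variable names from the original post; the data set name check and the variables expected, diff, and ratio are hypothetical.

```sas
/* Compare net_sale with quantity*unit_price on every row */
data check;
    set MIS543.CLOTHINGORDERS;
    expected = quantity * unit_price;
    diff     = net_sale - expected;
    /* Guard against division by zero */
    if expected ne 0 then ratio = net_sale / expected;
run;

/* If diff is always 0, net_sale is exactly quantity*unit_price.
   If ratio is a constant other than 1, a fixed multiplier
   (such as a tax rate) is probably included. */
proc means data=check n min mean max;
    var diff ratio;
run;
```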

--
Paige Miller
mccusker1818
Fluorite | Level 6
Thanks for all the help.
For my assignment I am supposed to use "predictive statistics" for a couple of datasets, but neither of the sets has a time ID variable such as months or quarters. I was planning on using PROC FORECAST, but since my data lacks any time variables, I am not sure what type of predictive statistics I can actually do.
PaigeMiller
Diamond | Level 26

In my use of the word "predictive", no time variable is needed (although you can certainly do predictive modeling if you do have a time variable). Suppose you are baking chocolate chip cookies: the predictors are temperature and amount of shortening, and the response is some measure of taste. You can statistically determine a relationship that uses temperature and amount of shortening to predict the taste.

 

Your sales example seems to be weak for the reasons I mentioned. It's not a statistical relationship between sales, price per unit, and quantity; it's a deterministic relationship.

 

 

--
Paige Miller
