BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mccusker1818
Fluorite | Level 6

Hello,

 

My problem is to first find the top 5 facility countries that had the most sales, which I have done by using PROC FREQ.

help1.PNG

 which gave me the list of countries by number of sales shown below.

help2.PNG

 Now, I know the top 5 countries out of 30.

The next task is to predict future sales for 3 months for these (top 5) countries.

I have no idea how to do this... So far I have this code:

help3.PNG

 which gives me this plot:

help4.PNG

 I have no knowledge of how to use PROC FORECAST to project the next 3 months of sales...

Any help is appreciated!

Thank you

 

1 ACCEPTED SOLUTION

Accepted Solutions
sbxkoenk
SAS Super FREQ

Hello,

 

I would certainly check if the de-duplication of dates makes sense ( the way you did it ).

@SASKiwi has proposed to use NODUPKEY option on PROC SORT statement to get rid of duplicate dates (within one and the same facility), but he also warned that this may not be the right way ! Maybe you have to sum the sales amounts on 2 or 3 same dates within one facility. The latter can be done with PROC TIMESERIES.

 

With regard to the forecasting amount 3 times being the same, that's because the default model is simple (single) exponential smoothing (MODEL=SIMPLE).

 

There are 8 possible ESM's, you could try them all and choose the best one based on some goodness-of-fit criterion you have learned in class (like MAPE = Mean absolute percentage error).

 

Here's what you can put on the FORECAST statement behind the forward slash ( / ) :

MODEL=model-name

specifies the forecasting model to be used to forecast the time series. You can specify the following forecasting model-names:

NONE

produces no forecast, but the time series is appended with missing values in the OUT= data set. This option is useful when the results stored in the OUT= data set are used in a subsequent analysis where forecasts of the independent variables are needed to forecast the dependent variable.

SIMPLE

performs simple (single) exponential smoothing.

DOUBLE

performs double (Brown) exponential smoothing.

LINEAR

performs linear (Holt) exponential smoothing.

DAMPTREND

performs damped trend exponential smoothing.

ADDSEASONAL | SEASONAL

performs additive seasonal exponential smoothing.

MULTSEASONAL

performs multiplicative seasonal exponential smoothing.

WINTERS

uses the Winters multiplicative method.

ADDWINTERS

uses the Winters additive method.

By default, MODEL=SIMPLE.

Good luck,

Koen

View solution in original post

16 REPLIES 16
sbxkoenk
SAS Super FREQ

Hello,

 

For an alternative to your PROC FREQ code, see here :

4 ways to find the k smallest and largest data values in SAS
By Rick Wicklin on The DO Loop January 26, 2022
https://blogs.sas.com/content/iml/2022/01/26/k-smallest-largest-data.html

 

To forecast 3 months ahead:

Do not use PROC FORECAST !! It's obsolete and discarded from the documentation.

If you want an easy method, use PROC ESM (exponential smoothing models with optimized smoothing weights).

Forecasting is a profession in itself. If you build up knowledge, you can use ARIMA models (without or with additional explanatory factors) or other methodologies. ESM is very basic!

 

Good luck,

Koen

mccusker1818
Fluorite | Level 6

Hi there, 

Thanks for the response, I was starting to think no one was going to reply.

I have never used PROC ESM, and this is not discussed in my textbook... 

 

How would I write the code for finding 3 months of sales data for each country?

 This is my code:

help6.PNG

 And the errors I receive:

help7.PNG

 Thank you for any help

sbxkoenk
SAS Super FREQ

Hello,

 

What is the forecasting method that your text book discusses?

 

If you are OK with using Exponential Smoothing, you can try this code :

proc esm data=mis543.toys out=NextMonths_Toys LEAD=3 PLOT=FORECASTS;
   by FacilityCountry;
   id TransactionMonth interval=month;
   forecast ProductPriceActual;
run;

Koen

mccusker1818
Fluorite | Level 6

Thanks for the help.

When I try the code you suggested, I get multiple errors..

help8.PNG

 I don't know if it changes the code required, but I only need to predict the sales for 5 country facilities including:

"United"

"Spain"

"Sweden"

"German"

"Austra"

 

(This is how they're spelled in the dataset, its not a typo)

sbxkoenk
SAS Super FREQ

Hello,

 

Your data is not sorted by country facility and (within the former) by timestamp.

On the ID statement you can add the NOTSORTED option.

But you need to sort at least on the variable that contains your by-groups.

 

Further more, you have duplicate dates (within one and the same by-group). Get rid of the duplicate dates. Duplicate dates are not allowed in time series analysis and forecasting. I would use PROC TIMESERIES to collapse (summarize) info from x times the same date in one record (observation).

 

Cheers,

Koen

mccusker1818
Fluorite | Level 6

I do not know how to sort data or change the dates to SAS format. Its odd that this assignment for my class is more about learning how to forecast sales and yet I was provided a dataset that needs to be changed so much. I feel that I should have been taught that first.

 

How do I sort the country variable? And how can I remove duplicate dates?

 

Thanks

SASKiwi
PROC Star

You use PROC SORT to sort your data. Try this:

proc sort data=mis543.toys 
              out=toys;
   by FacilityCountry TransactionMonth;
run;

proc esm data = toys.....
.......
mccusker1818
Fluorite | Level 6

After running this code:

help9.PNG

 I get 30 errors:

help10.PNG

 I feel like I am so close! How do I fix this?

SASKiwi
PROC Star

Changing the PROC SORT to remove rows with duplicate keys may get rid of the errors:

proc sort data=mis543.toys 
              out=toys
              nodupkey;
   by FacilityCountry TransactionMonth;
run;

However this might not be the correct way to deal with the duplicate rows. You need to understand your input data better to decide on the right way.

mccusker1818
Fluorite | Level 6

You have been a lot of help. Thank you so much for the assistance. I am still not sure what is wrong with this dataset, but I no longer have any errors! As you can see, the predicted "productpriceactual" (sales) has been forecasted for Jan-March 2018 (SUCCESS!) however, the numbers are all the same? Seems very odd... They are like this for all of the FacilityCountries.

help14.PNG

sbxkoenk
SAS Super FREQ

Hello,

 

I would certainly check if the de-duplication of dates makes sense ( the way you did it ).

@SASKiwi has proposed to use NODUPKEY option on PROC SORT statement to get rid of duplicate dates (within one and the same facility), but he also warned that this may not be the right way ! Maybe you have to sum the sales amounts on 2 or 3 same dates within one facility. The latter can be done with PROC TIMESERIES.

 

With regard to the forecasting amount 3 times being the same, that's because the default model is simple (single) exponential smoothing (MODEL=SIMPLE).

 

There are 8 possible ESM's, you could try them all and choose the best one based on some goodness-of-fit criterion you have learned in class (like MAPE = Mean absolute percentage error).

 

Here's what you can put on the FORECAST statement behind the forward slash ( / ) :

MODEL=model-name

specifies the forecasting model to be used to forecast the time series. You can specify the following forecasting model-names:

NONE

produces no forecast, but the time series is appended with missing values in the OUT= data set. This option is useful when the results stored in the OUT= data set are used in a subsequent analysis where forecasts of the independent variables are needed to forecast the dependent variable.

SIMPLE

performs simple (single) exponential smoothing.

DOUBLE

performs double (Brown) exponential smoothing.

LINEAR

performs linear (Holt) exponential smoothing.

DAMPTREND

performs damped trend exponential smoothing.

ADDSEASONAL | SEASONAL

performs additive seasonal exponential smoothing.

MULTSEASONAL

performs multiplicative seasonal exponential smoothing.

WINTERS

uses the Winters multiplicative method.

ADDWINTERS

uses the Winters additive method.

By default, MODEL=SIMPLE.

Good luck,

Koen

mccusker1818
Fluorite | Level 6

You have all been a tremendous help with this assignment. 

 

I really appreciate it! 

proc sort data=mis543.toys 
out=toys
nodupkey;
by FacilityCountry TransactionMonth;
run;proc sort data=work.toys
out=sorttoys;
by FacilityCountry TransactionMonth;
run;

proc esm data=work.sorttoys out=NextMonths_Toys LEAD=3 PLOT=FORECASTS;
by FacilityCountry;
id TransactionMonth interval=month;
forecast ProductPriceActual / model=addwinters;
run;

 I think this is the right code! Definitely feeling less defeated.

 

Thanks Again.

 

sbxkoenk
SAS Super FREQ

Hello,

 

The 2nd PROC SORT will do nothing.
work.toys and work.sorttoys are entirely equal (including the sort order of the observations).

Here's alternative code (in case nodupkey is not right and you want to sum the amounts that have the same date (within a facility)).

proc sort data=mis543.toys 
           out=work.toys;
 by FacilityCountry TransactionMonth;
run;

proc timeseries data=work.toys out=work.mtoys;
   by FacilityCountry ;
   id TransactionMonth interval=month
                       accumulate=TOTAL /*median*/
                       setmiss=0
                       /* start='01jan1998'd */
                       /* end  ='31dec2000'd */  ;
   var ProductPriceActual;
run;

proc esm data=work.mtoys out=NextMonths_Toys LEAD=3 PLOT=FORECASTS;
   by FacilityCountry;
   id TransactionMonth interval=month;
   forecast ProductPriceActual / model=addwinters;
run;

Good luck,

Koen

eaking
Calcite | Level 5

I think I'm taking this same class that you were. I am in the same exact boat! Sorted to find top five countries, NOW WHAT!? Textbook is NO help. Modules hardly touch on this subject. (Also, my internet was out for the past two days, causing me to work on this one and two days before it's due, little time to ask prof for help!)

 

What does "sales" even mean? I interpreted it the same way you did - a frequency count of observations in each country because each row seems to be unique by product ordered. But does sales mean total dollars from "ProductPriceActual"? Do I need to add up all the actual price per country?

 

I feel like I know what I'd like to do with this dataset to get started, but it's so humongous for just a homework assignment, and it also feels disconnected from the course material. Also, all help I'm finding online is about "forecasting," but forecasting is never mentioned in this course. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 16 replies
  • 4011 views
  • 5 likes
  • 5 in conversation