Fluorite | Level 6

## Project 3 months of sales for Top 5 facilities

Hello,

My problem is to first find the top 5 facility countries that had the most sales, which I have done by using PROC FREQ.

which gave me the list of countries by number of sales shown below.

Now, I know the top 5 countries out of 30.

The next task is to predict future sales for 3 months for these (top 5) countries.

I have no idea how to do this... So far I have this code:

which gives me this plot:

I have no knowledge of how to use PROC FORECAST to project the next 3 months of sales...

Any help is appreciated!

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
SAS Super FREQ

## Re: Project 3 months of sales for Top 5 facilities

Hello,

I would certainly check if the de-duplication of dates makes sense ( the way you did it ).

@SASKiwi has proposed to use NODUPKEY option on PROC SORT statement to get rid of duplicate dates (within one and the same facility), but he also warned that this may not be the right way ! Maybe you have to sum the sales amounts on 2 or 3 same dates within one facility. The latter can be done with PROC TIMESERIES.

With regard to the forecasting amount 3 times being the same, that's because the default model is simple (single) exponential smoothing (MODEL=SIMPLE).

There are 8 possible ESM's, you could try them all and choose the best one based on some goodness-of-fit criterion you have learned in class (like MAPE = Mean absolute percentage error).

Here's what you can put on the FORECAST statement behind the forward slash ( / ) :

MODEL=model-name

specifies the forecasting model to be used to forecast the time series. You can specify the following forecasting model-names:

NONE

produces no forecast, but the time series is appended with missing values in the OUT= data set. This option is useful when the results stored in the OUT= data set are used in a subsequent analysis where forecasts of the independent variables are needed to forecast the dependent variable.

SIMPLE

performs simple (single) exponential smoothing.

DOUBLE

performs double (Brown) exponential smoothing.

LINEAR

performs linear (Holt) exponential smoothing.

DAMPTREND

performs damped trend exponential smoothing.

MULTSEASONAL

performs multiplicative seasonal exponential smoothing.

WINTERS

uses the Winters multiplicative method.

By default, MODEL=SIMPLE.

Good luck,

Koen

16 REPLIES 16
SAS Super FREQ

## Re: Project 3 months of sales for Top 5 facilities

Hello,

For an alternative to your PROC FREQ code, see here :

4 ways to find the k smallest and largest data values in SAS
By Rick Wicklin on The DO Loop January 26, 2022
https://blogs.sas.com/content/iml/2022/01/26/k-smallest-largest-data.html

Do not use PROC FORECAST !! It's obsolete and discarded from the documentation.

If you want an easy method, use PROC ESM (exponential smoothing models with optimized smoothing weights).

Forecasting is a profession in itself. If you build up knowledge, you can use ARIMA models (without or with additional explanatory factors) or other methodologies. ESM is very basic!

Good luck,

Koen

Fluorite | Level 6

## Re: Project 3 months of sales for Top 5 facilities

Hi there,

Thanks for the response, I was starting to think no one was going to reply.

I have never used PROC ESM, and this is not discussed in my textbook...

How would I write the code for finding 3 months of sales data for each country?

This is my code:

Thank you for any help

SAS Super FREQ

## Re: Project 3 months of sales for Top 5 facilities

Hello,

What is the forecasting method that your text book discusses?

If you are OK with using Exponential Smoothing, you can try this code :

``````proc esm data=mis543.toys out=NextMonths_Toys LEAD=3 PLOT=FORECASTS;
by FacilityCountry;
id TransactionMonth interval=month;
forecast ProductPriceActual;
run;``````

Koen

Fluorite | Level 6

## Re: Project 3 months of sales for Top 5 facilities

Thanks for the help.

When I try the code you suggested, I get multiple errors..

I don't know if it changes the code required, but I only need to predict the sales for 5 country facilities including:

"United"

"Spain"

"Sweden"

"German"

"Austra"

(This is how they're spelled in the dataset, its not a typo)

SAS Super FREQ

## Re: Project 3 months of sales for Top 5 facilities

Hello,

Your data is not sorted by country facility and (within the former) by timestamp.

On the ID statement you can add the NOTSORTED option.

But you need to sort at least on the variable that contains your by-groups.

Further more, you have duplicate dates (within one and the same by-group). Get rid of the duplicate dates. Duplicate dates are not allowed in time series analysis and forecasting. I would use PROC TIMESERIES to collapse (summarize) info from x times the same date in one record (observation).

Cheers,

Koen

Fluorite | Level 6

## Re: Project 3 months of sales for Top 5 facilities

I do not know how to sort data or change the dates to SAS format. Its odd that this assignment for my class is more about learning how to forecast sales and yet I was provided a dataset that needs to be changed so much. I feel that I should have been taught that first.

How do I sort the country variable? And how can I remove duplicate dates?

Thanks

Opal | Level 21

## Re: Project 3 months of sales for Top 5 facilities

You use PROC SORT to sort your data. Try this:

``````proc sort data=mis543.toys
out=toys;
by FacilityCountry TransactionMonth;
run;

proc esm data = toys............``````
Fluorite | Level 6

## Re: Project 3 months of sales for Top 5 facilities

After running this code:

I get 30 errors:

I feel like I am so close! How do I fix this?

Opal | Level 21

## Re: Project 3 months of sales for Top 5 facilities

Changing the PROC SORT to remove rows with duplicate keys may get rid of the errors:

``````proc sort data=mis543.toys
out=toys
nodupkey;
by FacilityCountry TransactionMonth;
run;``````

However this might not be the correct way to deal with the duplicate rows. You need to understand your input data better to decide on the right way.

Fluorite | Level 6

## Re: Project 3 months of sales for Top 5 facilities

You have been a lot of help. Thank you so much for the assistance. I am still not sure what is wrong with this dataset, but I no longer have any errors! As you can see, the predicted "productpriceactual" (sales) has been forecasted for Jan-March 2018 (SUCCESS!) however, the numbers are all the same? Seems very odd... They are like this for all of the FacilityCountries.

SAS Super FREQ

## Re: Project 3 months of sales for Top 5 facilities

Hello,

I would certainly check if the de-duplication of dates makes sense ( the way you did it ).

@SASKiwi has proposed to use NODUPKEY option on PROC SORT statement to get rid of duplicate dates (within one and the same facility), but he also warned that this may not be the right way ! Maybe you have to sum the sales amounts on 2 or 3 same dates within one facility. The latter can be done with PROC TIMESERIES.

With regard to the forecasting amount 3 times being the same, that's because the default model is simple (single) exponential smoothing (MODEL=SIMPLE).

There are 8 possible ESM's, you could try them all and choose the best one based on some goodness-of-fit criterion you have learned in class (like MAPE = Mean absolute percentage error).

Here's what you can put on the FORECAST statement behind the forward slash ( / ) :

MODEL=model-name

specifies the forecasting model to be used to forecast the time series. You can specify the following forecasting model-names:

NONE

produces no forecast, but the time series is appended with missing values in the OUT= data set. This option is useful when the results stored in the OUT= data set are used in a subsequent analysis where forecasts of the independent variables are needed to forecast the dependent variable.

SIMPLE

performs simple (single) exponential smoothing.

DOUBLE

performs double (Brown) exponential smoothing.

LINEAR

performs linear (Holt) exponential smoothing.

DAMPTREND

performs damped trend exponential smoothing.

MULTSEASONAL

performs multiplicative seasonal exponential smoothing.

WINTERS

uses the Winters multiplicative method.

By default, MODEL=SIMPLE.

Good luck,

Koen

Fluorite | Level 6

## Re: Project 3 months of sales for Top 5 facilities

You have all been a tremendous help with this assignment.

I really appreciate it!

`proc sort data=mis543.toys               out=toys              nodupkey;   by FacilityCountry TransactionMonth;run;proc sort data=work.toys               out=sorttoys;   by FacilityCountry TransactionMonth;run;proc esm data=work.sorttoys out=NextMonths_Toys LEAD=3 PLOT=FORECASTS;   by FacilityCountry;   id TransactionMonth interval=month;   forecast ProductPriceActual / model=addwinters;run;`

I think this is the right code! Definitely feeling less defeated.

Thanks Again.

SAS Super FREQ

## Re: Project 3 months of sales for Top 5 facilities

Hello,

The 2nd PROC SORT will do nothing.
work.toys and work.sorttoys are entirely equal (including the sort order of the observations).

Here's alternative code (in case nodupkey is not right and you want to sum the amounts that have the same date (within a facility)).

``````proc sort data=mis543.toys
out=work.toys;
by FacilityCountry TransactionMonth;
run;

proc timeseries data=work.toys out=work.mtoys;
by FacilityCountry ;
id TransactionMonth interval=month
accumulate=TOTAL /*median*/
setmiss=0
/* start='01jan1998'd */
/* end  ='31dec2000'd */  ;
var ProductPriceActual;
run;

proc esm data=work.mtoys out=NextMonths_Toys LEAD=3 PLOT=FORECASTS;
by FacilityCountry;
id TransactionMonth interval=month;
run;``````

Good luck,

Koen

Calcite | Level 5

## Re: Project 3 months of sales for Top 5 facilities

I think I'm taking this same class that you were. I am in the same exact boat! Sorted to find top five countries, NOW WHAT!? Textbook is NO help. Modules hardly touch on this subject. (Also, my internet was out for the past two days, causing me to work on this one and two days before it's due, little time to ask prof for help!)

What does "sales" even mean? I interpreted it the same way you did - a frequency count of observations in each country because each row seems to be unique by product ordered. But does sales mean total dollars from "ProductPriceActual"? Do I need to add up all the actual price per country?

I feel like I know what I'd like to do with this dataset to get started, but it's so humongous for just a homework assignment, and it also feels disconnected from the course material. Also, all help I'm finding online is about "forecasting," but forecasting is never mentioned in this course.

Discussion stats
• 16 replies
• 1187 views
• 5 likes
• 5 in conversation