BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
stlimpbizkit
Calcite | Level 5

Hello SAS community friends,

Previously I asked about Call SYMPUT function and whether it can assign values to macro variable for each record based on existing data field and retain that value for data step before moving on to the next record. The new question is similar and yet different, using a similar example below for demonstrative purpose. If you are interested, feel free to check out the original question that is helped and answered by the community: https://communities.sas.com/message/277182#277182.

A bit of context - I have a list of accounts that were opened at various point in time and hence with different performance length, measured by field "month_since_opened". The performance I am measuring is spend. Based on the available data points, I am to forecast the future spend and extrapolate the spend curve based on a known shape, represented by a series of shape factors (not shown in example for simplicity).

Acct_NoMonth_since_openedSpend_month_1Spend_month_2Spend_month_3Spend_month_4Spend_month_5Spend_month_6Spend_month_7
15500450600520250
250300250300250
34150350250200.
43600510460..
52500450...

My challenge is to create a moving reference to produce forecast based on spend data from the trailing month, or last available month for each account to be exact. I haven't been able to write the statement correctly (in blue below).


%let count=60;                                                                                               /*assume we need to forecast up to 5 years/ 60 months of performance*/

data output;

set input;

array spend (60);                                                                                            /*In my previous post, array function was recommended as a solution to create the shell*/

do i = months_since_opened to &count;

spend (i ) = spend (i-1) / shape factor (i-1) * shape factor i                 /* simplified algorithm: divided spend from month i-1 by shape factor i-1 to get "normalized" spend value , then apply shape factor at month i to forecast spend at month i; shape factors are existing field by month similar to spend_month_n*/

end;

run;

Hope the way question is explained makes sense, and thanks in advance for any hint/ideas you might be able to share with me!

1 ACCEPTED SOLUTION

Accepted Solutions
stlimpbizkit
Calcite | Level 5

Hello everyone, with everyone's input I think I am getting close. To provide closure and give back to the group as future reference, below is the draft answer. There is more refinement to be done in my actual analysis, but it should be sufficient to address my original ask:

You can see that an extra j was introduced besides i. Variable i is to loop through future months to create forecast, while variable j is to lock in the month_since_opened for each record so that forecast starts at the incremental month based on how many months of actual spend are available.

%let count=60;

data output;

set input;

array forecast_spend (60);

array actual_spend spend1-spend60;

array shape shape1-shape60;

do i = month_since_opened+1 to &count;

%let j = max_mob;

forecast_spend(i ) = actual_spend(&j) / shape(&j) * shape (i);

end;

run;

Thanks guys for your inputs: ballardw, astounding, reeza

View solution in original post

20 REPLIES 20
ballardw
Super User

Is the "data" shown the input or the output?

The pseudo-code you show references  a shape factor array, you need to declare and populate that at least.

stlimpbizkit
Calcite | Level 5

Thanks for the feedback. That was valid point. I have updated the original content to: 1). reflect the correct input / output dataset; 2) add note that shape factors are a series of existing fields by month

Astounding
PROC Star

One piece of the puzzle:  Your code is going to replace the Spend variables with new values.  Instead, create a second set of variables to hold the forecasts.  Once you do that, the need for macro language likely disappears.

array forecast {60} forecast1-forecast60;

Then:

forecast (i ) = spend (i-1) / shape factor (i-1) * shape factor i  

You may not utilize forecast1, but it's easier to leave it in the program.

stlimpbizkit
Calcite | Level 5

Thanks for continuing to help with this case after pointing me to the right direction in part 1.

I actually did what you suggested - creating new series of fields "forecasted_spend_month_n". (Stitching actual and forecasted spend together into one string for further analysis will be another thing I need to figure out at a later point Smiley Happy). In this post, my challenge relies within the code.

forecast (i ) = spend (i-1) / shape factor (i-1) * shape factor i ,     this formula is somewhat conceptual to capture the intent. I was not able to write out the code that works, as in - When I uses "spend (i-1)" in the code, SAS wouldn't understand it is the trailing month spend that I am referring to. I tried introducing a second variable j = i-1 so that I can use j instead of i-1, but that hasn't been successful for me either.

The crux of my issue here is that (i ) and (i-1) are part of field name, and I am looking for ways to be able to refer to them in my formula respectively.

Astounding
PROC Star

It's a little hard to see what you have actually done, but SAS definitely understands that spend(i-1) refers to the previous month.  The only time you would run into trouble with that is when i=1 and there is no previous month.  The error message would say something like "Array subscript out of range."

stlimpbizkit
Calcite | Level 5

I tried referring to the trailing month spend by using "spend(i-1)" or "spend(&i-1)", and got error message saying "The function SPEND is unknown, or cannot be accessed", which leads me to believe that SAS does not recognize "spend" as part of field name if I add any calculation upon i..

ballardw
Super User

Have you created an array of variables named spend? If not then SAS has to guess what "spend" is and the way you are using it makes SAS think it might be a function.

Time to show the actual code you are attempting to run.

stlimpbizkit
Calcite | Level 5

Thanks that makes sense. I haven't created an array named spend and hence the error message. Spend data are in the form of available columns from existing dataset, and so is shape:

    • spend data - as the example shown in the post (actual dataset has more records and more months of spend performance. Also there is segment field that I used to append shape later on)
    • shape data - has columns for shape factors by month up to 60 months that is to be covered.

Next, I merged spend data with shape data by segment. The merged dataset is the input dataset for the question posted.

%let count=60;

data output;

set input;

array spend (60);

do i = months_since_opened to &count;

spend (i ) = spend (i-1) / shape factor (i-1) * shape factor i

end;

run;

By the way I wasn't trying to hide my code since I am the one looking for suggestions.  :smileycool:  Just trying to simplify the case without losing necessary info. Besides what I have pasted here, the rest are a few data steps to prepare the input as described above.

Reeza
Super User

Where's the declaration for the shape array? If I understand correctly its in a separate dataset so need to merge it in somehow as well. If its a set of constants I'd consider a temporary array.

stlimpbizkit
Calcite | Level 5

For shape, I am actually working with dummy numbers right now to sort out the SAS code first. To come up with the right shape by segment, that is a separate piece of valuation analysis by itself. To be further transparent, I created the dummy shape like this (literally "dummy" method):

DATA SHAPE;

   INPUT NAME $ shape1 shape2 shape3 shape4 shape5 shape6 shape7 shape8 shape9 shape10 shape11 shape12 shape13 shape14 shape15 shape16 shape17 shape18 shape19 shape20 shape21 shape22 shape23 shape24;

   DATALINES;

          spend_shape  0.2 0.56 0.65 0.73 0.80 0.86 0.91 0.95 0.98 1.00 1.02 1.04 1.06 1.08 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18 1.19

;

Reeza
Super User

And how are you planning to pass those values to the data step indicated in your previous code where you seen to be referring to a SHAPE array already?

stlimpbizkit
Calcite | Level 5

Those shape values are stored in dataset "shape" which is merged with "spend" and combined into "input".

I don't have anything between merging shape & spend and the piece of code with blue line above . Maybe certain steps are needed before I can create the array with relative reference (much like offset function in excel)?  

Reeza
Super User

You still haven't declared the shape array then. And for the rest, I don't understand the problem anymore.

Reeza
Super User

Perhaps something like the following (untested):


%let count=60;

data output;

set input;

array spend (60);

array shape_factor(24) _temporary_ ( 0.2 0.56 0.65 0.73 0.80 0.86 0.91 0.95 0.98 1.00 1.02 1.04 1.06 1.08 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18 1.19);

do i = months_since_opened to &count;

spend (i ) = spend (i-1) / shape_factor (i-1) * shape_factor( i )

end;

run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 20 replies
  • 3798 views
  • 6 likes
  • 4 in conversation