BookmarkSubscribeRSS Feed
SasStatistics
Pyrite | Level 9

I have a dataset that contains the columns: Period, Segment and Sum. 

Period Segment Sum
jan-22 A 100
jan-22 B 50
jan-22 C 40
feb-22 A 80
feb-22 B 60
feb-22 C 30
mar-22 A 130
mar-22 B 110
mar-22 C 100
apr-22 A 200
apr-22 B 150
apr-22 C 40
maj-22 A 100
maj-22 B 80
maj-22 C 60
jun-22 A 200
jun-22 B 150
jun-22 C 40
jul-22 A 500
jul-22 B 100
jul-22 C 50
aug-22 A 400
aug-22 B 200
aug-22 C 100
sep-22 A 200
sep-22 B 100
sep-22 C 30
okt-22 A 400
okt-22 B 300
okt-22 C 200
nov-22 A 300
nov-22 B 100
nov-22 C 50
dec-22 A 40
dec-22 B 20
dec-22 C 10


Now I would like to create a new column that is the moving/rolling sum for each segment over the previous 6 periods. 

In total this would give me the following table: 

Period Segment Sum summary_rolling_6months_per_segment
jan-22 A 100  
jan-22 B 50  
jan-22 C 40  
feb-22 A 80  
feb-22 B 60  
feb-22 C 30  
mar-22 A 130  
mar-22 B 110  
mar-22 C 100  
apr-22 A 200  
apr-22 B 150  
apr-22 C 40  
maj-22 A 100  
maj-22 B 80  
maj-22 C 60  
jun-22 A 200  
jun-22 B 150  
jun-22 C 40  
jul-22 A 500 1310
jul-22 B 100 700
jul-22 C 50 360
aug-22 A 400 1610
aug-22 B 200 850
aug-22 C 100 420
sep-22 A 200 1730
sep-22 B 100 890
sep-22 C 30 420
okt-22 A 400 2000
okt-22 B 300 1080
okt-22 C 200 520
nov-22 A 300 2100
nov-22 B 100 1030
nov-22 C 50 530
dec-22 A 40 2040
dec-22 B 20 970
dec-22 C 10 480

 

To simplify for you when creating the sas table: 

data have;
input Period $ Segment $ Sum $ sum_rolling_6_mths $ 
;
datalines;
jan-22	 A	 100 .
jan-22	 B	 50 .
jan-22	 C	 40 .
feb-22	 A	 80 .
feb-22	 B	 60 .
feb-22	 C   30 . 
mar-22	 A	 130 .
mar-22	 B	 110 .
mar-22	 C   100 . 
apr-22	 A	 200 .
apr-22	 B	 150 .
apr-22	 C   100 .
maj-22	 A	 100 .
maj-22	 B	 80 .
maj-22	 C	 60 . 
jun-22	 A	 200 .
jun-22	 B	 150 .
jun-22	 C	 40 .
jul-22	 A	 500 1310
jul-22	 B	 100 700
jul-22	 C	 50 360
aug-22	 A	 400 1610
aug-22	 B	 200 850
aug-22	 C	 100 420
sep-22	 A	 200 1730
sep-22	 B	 100 890
sep-22	 C	 30 420
okt-22	 A	 400 2000
okt-22	 B	 300 1080
okt-22	 C	 200 520
nov-22	 A	 300 2100
nov-22	 B	 100 1030
nov-22	 C	 50 530
dec-22	 A   40 2040
dec-22	 B	 20 970
dec-22	 C   10 480
;
run;

 

2 REPLIES 2
SasStatistics
Pyrite | Level 9

Answer by @PaigeMiller  [Had to remove original post since accepted the answer by misstake]. 

 

First, do not try to work with dates/months as character strings, because these cannot be sorted. (If you did sort them, then April would be the first month, and I'm guessing you don't want that). Work with numeric date values, as shown below in variable month. Then sort by segment and month, then compute the moving averages using PROC EXPAND (if you have a license for SAS/ETS)

 

data have;
input Period $ Segment $ Sum;
month=input(compress(period,'-'),monyy.);
format month monyy7.;
datalines;
jan-22	 A	 100 .
jan-22	 B	 50 .
jan-22	 C	 40 .
feb-22	 A	 80 .
feb-22	 B	 60 .
feb-22	 C   30 . 
mar-22	 A	 130 .
mar-22	 B	 110 .
mar-22	 C   100 . 
apr-22	 A	 200 .
apr-22	 B	 150 .
apr-22	 C   100 .
may-22	 A	 100 .
may-22	 B	 80 .
may-22	 C	 60 . 
jun-22	 A	 200 .
jun-22	 B	 150 .
jun-22	 C	 40 .
jul-22	 A	 500 1310
jul-22	 B	 100 700
jul-22	 C	 50 360
aug-22	 A	 400 1610
aug-22	 B	 200 850
aug-22	 C	 100 420
sep-22	 A	 200 1730
sep-22	 B	 100 890
sep-22	 C	 30 420
oct-22	 A	 400 2000
oct-22	 B	 300 1080
oct-22	 C	 200 520
nov-22	 A	 300 2100
nov-22	 B	 100 1030
nov-22	 C	 50 530
dec-22	 A   40 2040
dec-22	 B	 20 970
dec-22	 C   10 480
;

proc sort data=have;
    by segment month;
run;
proc expand data=have out=want;
    by segment;
    convert sum=sum_rolling_6_mths / transformout=(movsum 6 trimleft 5);
run;

 

 

 

SasStatistics
Pyrite | Level 9

Answer by @ballardw  [Had to remove the original post due to accepting a solution by misstake]. 

 

Correctly describe the values summed. You are apparently adding the CURRENT to the previous 6 months, not a six month rolling sum.

Does each "month" always have exactly 3 observations?

Why if you want to "sum" a value are you making it character in your input statement?????

 

IF there are exactly 3 records for each month and in segment order with in month the following works WHEN SUM is numeric without needing any other modules.

 

This actually takes advantage of the + operator returning missing values when any are missing. The Lag function will return missing if that number of previous record does not have a value.

Caution: This assumes that the Sum variable is not ever missing. If you have missing values then you need to provide such an example. And better rules for what the rolling sum would be in that case.

data want;
   set have;
   rollingsum = lag3(sum)+
                lag6(sum)+
                lag9(sum)+
                lag12(sum)+
                lag15(sum)+
                lag18(sum)
                +sum
   ;
run;

 

How sure are you that 1310 is correct for the first Jul-22? Your data step has

apr-22	 C   100 .

but your "want" table has which is a different value.

apr-22 C 40  

When I add the data step values for Jan through Jun I get 1780, not 1310.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 345 views
  • 2 likes
  • 1 in conversation