I used Proc Expand and averaged averages.
My colleague used a data step and divided moving numerator sums and denominator sums.
We got different answers due to different weighting.
Which is the correct way to calculate a moving average?
Both are valid approaches and you should get the same answers.
But if you're not using the same methodology then they won't be equal, which may be the actual issue.
If your data has weights and you want to factor those in I don't believe PROC EXPAND supports weights the same way.
You would need to specify the weights in the CONVERT statement. Numbers in purpler are the weights.
convert x=y / transformout=( cmovave( .1 .2 .4 .2 .1 ) );
We don't have enough information at this point to say one's correct over another without seeing any data or code.
@proctice wrote:
I used Proc Expand and averaged averages.
My colleague used a data step and divided moving numerator sums and denominator sums.
We got different answers due to different weighting.
Which is the correct way to calculate a moving average?
We don't use weights, but we wouldn't get the same answer if one time period had many more cases than another time period. The sum method would naturally give more weight to the time period with more cases and the average of averages method would give them all equal weight.
@proctice wrote:
We don't use weights, but we wouldn't get the same answer if one time period had many more cases than another time period. The sum method would naturally give more weight to the time period with more cases and the average of averages method would give them all equal weight.
We don't have enough information at this point to say one's correct over another without seeing any data or code.
Mean of time period  
time 1  30  60  90  30  60  50  53.33333 
time 2  20  10  15  
time 3  5  5  3  2  1  3.2  
time 4  12  3  50  21.66667  

mean of means  23.3  
mean  25.4 
But imagine more time periods and rolling means of the time periods. mean of time 14, mean of time 25, mean of time 36, etc.
I calculate rolling mean using the mean of the means, colleague calculates using the mean of the values in the time periods.
Why would you have multiple measurements per time? Are they per person or some other thing? Some type of repeated measurement? Context does decide how it should be handled. For example, depending on what that 'across' variable is, maybe you should be averaging the columns first (or totaling) and then averaging afterwards.
As posted, no, the mean of means here is not what I would consider a good solution because you have multiple measurements per time.
So depending on how that's defined and the context you may want to summarize each time and then possibly add a weight.
If this is a moving average, you should have more than one value though, ie moving average implies last two values and carried forward or average of the three most recent times moving forward etc.
@proctice wrote:
Mean of time period time 1 30 60 90 30 60 50 53.33333 time 2 20 10 15 time 3 5 5 3 2 1 3.2 time 4 12 3 50 21.66667
mean of means 23.3 mean 25.4
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.