BookmarkSubscribeRSS Feed
proctice
Quartz | Level 8

I used Proc Expand and averaged averages.

 

My colleague used a data step and divided moving numerator sums and denominator sums.  

 

We got different answers due to different weighting.  

 

 

Which is the correct way to calculate a moving average?

6 REPLIES 6
Reeza
Super User

Both are valid approaches and you should get the same answers.

But if you're not using the same methodology then they won't be equal, which may be the actual issue.

 

If your data has weights and you want to factor those in I don't believe PROC EXPAND supports weights the same way.

You would need to specify the weights in the CONVERT statement. Numbers in purpler are the weights.

   convert x=y / transformout=( cmovave( .1 .2 .4 .2 .1 ) );

 

We don't have enough information at this point to say one's correct over another without seeing any data or code.

 


@proctice wrote:

I used Proc Expand and averaged averages.

 

My colleague used a data step and divided moving numerator sums and denominator sums.  

 

We got different answers due to different weighting.  

 

 

Which is the correct way to calculate a moving average?


 

 

proctice
Quartz | Level 8

We don't use weights, but we wouldn't get the same answer if one time period had many more cases than another time period. The sum method would naturally give more weight to the time period with more cases and the average of averages method would give them all equal weight.  

 

 

 

Reeza
Super User

@proctice wrote:

We don't use weights, but we wouldn't get the same answer if one time period had many more cases than another time period. The sum method would naturally give more weight to the time period with more cases and the average of averages method would give them all equal weight.  

 

 

 


We don't have enough information at this point to say one's correct over another without seeing any data or code.

proctice
Quartz | Level 8
              Mean of time period
time 1 30 60 90 30 60 50 53.33333
time 2 20 10         15
time 3 5 5 3 2 1   3.2
time 4  12 3 50       21.66667
 

 

 

        mean of means 23.3
            mean 25.4

 

But imagine more time periods and rolling means of the time periods. mean of time 1-4, mean of time 2-5, mean of time 3-6, etc.  

I calculate rolling mean using the mean of the means, colleague calculates using the mean of the values in the time periods.

 

 

Reeza
Super User

Why would you have multiple measurements per time? Are they per person or some other thing? Some type of repeated measurement? Context does decide how it should be handled. For example, depending on what that 'across' variable is, maybe you should be averaging the columns first (or totaling) and then averaging afterwards. 

 

As posted, no, the mean of means here is not what I would consider a good solution because you have multiple measurements per time. 
So depending on how that's defined and the context you may want to summarize each time and then possibly add a weight. 

 

If this is a moving average, you should have more than one value though, ie moving average implies last two values and carried forward or average of the three most recent times moving forward etc. 

 


@proctice wrote:
              Mean of time period
time 1 30 60 90 30 60 50 53.33333
time 2 20 10         15
time 3 5 5 3 2 1   3.2
time 4  12 3 50       21.66667
 

 

 

        mean of means 23.3
            mean 25.4

 


 

 

proctice
Quartz | Level 8
Columns are individual people, number of people varies each time period.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 846 views
  • 0 likes
  • 2 in conversation