BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ebowen
Quartz | Level 8

I'm trying to do a moving average with PROC EXPAND where the average takes into account only those observations that occur within the previous seven calendar days, which is not necessarily the previous seven observations. I have a dataset that has some gaps, such as weekends, holidays, etc., where the data is not reported. But I don't want to average observations that are outside the weekly window. In other words, if I'm calculating the moving average on a Monday, I want the average to be only those observations from the previous Tuesday to the current day, regardless of whether there are observations for the weekend.

 

Currently PROC EXPAND is taking seven previous observations, which can average in numbers from a few days before that. Here's my code:

 

proc expand DATA=gas.gasprices OUT=gas.GasPrices_ma METHOD=none;
ID Date;
convert spread=Spread_MA / TRANSFORMOUT = (nomiss movave 7 trim 6);
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Use PROC TIMESERIES to first fill in the gaps with missing and then use PROC EXPAND to do the moving averages. 

 

You could also use a data step if so inclined, but this method is easier as it will scale to month/years/weeks as necessary. 

Here's a fully worked example that shows how this is done at the monthly level, you will need to change the INTERVAL to day for your case.

 

/*this is an example of time series data. 
1. Create a time series data set with missing intervals (IBM)
2. Add back missing entries using PROC TIMESERIES (IBM_NO_MISSING)
3. Calculate moving average - 12 month average
*/

/*1*/
data ibm;
	set sashelp.stocks;
	where stock='IBM';

	if month(date)=7 then
		delete;
run;

proc sort data=ibm;
	by date;
run;

/*2*/
proc timeseries data=ibm out=ibm_no_missing;
	id date interval=month start='01Aug1986'd end='01Dec2005'd;
	var open;
run;

/*3*/
proc expand data=ibm_no_missing out=want;
	id date;
	convert open = open_12 / method=none transformout= (movave 12);
run;

@ebowen wrote:

I'm trying to do a moving average with PROC EXPAND where the average takes into account only those observations that occur within the previous seven calendar days, which is not necessarily the previous seven observations. I have a dataset that has some gaps, such as weekends, holidays, etc., where the data is not reported. But I don't want to average observations that are outside the weekly window. In other words, if I'm calculating the moving average on a Monday, I want the average to be only those observations from the previous Tuesday to the current day, regardless of whether there are observations for the weekend.

 

Currently PROC EXPAND is taking seven previous observations, which can average in numbers from a few days before that. Here's my code, data attached:

 

proc expand DATA=gas.gasprices OUT=gas.GasPrices_ma METHOD=none;
ID Date;
convert spread=Spread_MA / TRANSFORMOUT = (nomiss movave 7 trim 6);
run;



View solution in original post

7 REPLIES 7
Reeza
Super User

Use PROC TIMESERIES to first fill in the gaps with missing and then use PROC EXPAND to do the moving averages. 

 

You could also use a data step if so inclined, but this method is easier as it will scale to month/years/weeks as necessary. 

Here's a fully worked example that shows how this is done at the monthly level, you will need to change the INTERVAL to day for your case.

 

/*this is an example of time series data. 
1. Create a time series data set with missing intervals (IBM)
2. Add back missing entries using PROC TIMESERIES (IBM_NO_MISSING)
3. Calculate moving average - 12 month average
*/

/*1*/
data ibm;
	set sashelp.stocks;
	where stock='IBM';

	if month(date)=7 then
		delete;
run;

proc sort data=ibm;
	by date;
run;

/*2*/
proc timeseries data=ibm out=ibm_no_missing;
	id date interval=month start='01Aug1986'd end='01Dec2005'd;
	var open;
run;

/*3*/
proc expand data=ibm_no_missing out=want;
	id date;
	convert open = open_12 / method=none transformout= (movave 12);
run;

@ebowen wrote:

I'm trying to do a moving average with PROC EXPAND where the average takes into account only those observations that occur within the previous seven calendar days, which is not necessarily the previous seven observations. I have a dataset that has some gaps, such as weekends, holidays, etc., where the data is not reported. But I don't want to average observations that are outside the weekly window. In other words, if I'm calculating the moving average on a Monday, I want the average to be only those observations from the previous Tuesday to the current day, regardless of whether there are observations for the weekend.

 

Currently PROC EXPAND is taking seven previous observations, which can average in numbers from a few days before that. Here's my code, data attached:

 

proc expand DATA=gas.gasprices OUT=gas.GasPrices_ma METHOD=none;
ID Date;
convert spread=Spread_MA / TRANSFORMOUT = (nomiss movave 7 trim 6);
run;



ebowen
Quartz | Level 8

Thanks Reeza! This solution works pretty well, though it does add in observations for the moving average for dates where the original data doesn't exist. I just cut out those data with a where statement in the out dataset to keep it to the original observed dates.

akshay_solanki
Fluorite | Level 6
Hello All
Can anyone tell me how to do 7 hourly moving average timeseries data?

Using proc expand..

Reeza
Super User
Is your data already hourly?

If so something as simple as this may work:

proc expand data=have out=want method=none;
id datetimevariable;
convert value = value7 / transout = (moveave 7);
run;

https://support.sas.com/documentation/cdl/en/etsug/68148/HTML/default/viewer.htm#etsug_expand_exampl...
akshay_solanki
Fluorite | Level 6
I've a same datetime 30 observation with sequence 1 and datetime format date variable and sequence variable which is group by sequence. I want to do 7 hourly moving average timeseries data and plot by sequence.

And I've missing data also in value.
Reeza
Super User
Please start a new thread with your question and include sample data that reflects your actual data, fake data is fine.
akshay_solanki
Fluorite | Level 6
I just add to=hour and sorting group by variable then it's work very well.

Thank you for your help and support.

😀😇🤩

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2316 views
  • 2 likes
  • 3 in conversation