I would like to calculate standard deviation of a variable for the recent five years. Proc Expand works pretty well, but not in a perfect way I expect it would.
data prdsale; set sashelp.PrdSale; run; proc sort data= prdsale nodupkey; by region division year quarter; run; proc expand data= prdsale out= prdsale; by region division; convert actual= StdDev_actual/transformout= (MovStd 5); run; proc print ; var region division actual StdDev_Actual; run;
The above code calculates the standard deviation of the variable, actual, using the most recent five data point. But, it also calculates Std.Dev. when there are less than 5 data point. For example, the third observation in the data has the data point in time t, t-1, and t-2 (but not in time t-3 and t-4). It still calculates the Std.Dev. of three data points. Can I just calculate it only when there are strictly five data point?
Add the TRIM and NOMISS options to your CONVERT statement.
See the explanation in Trick 4 in this paper or check the documentation.
https://support.sas.com/resources/papers/proceedings10/093-2010.pdf
NOMISS options, from documentation link above
The NOMISS operator does not perform any calculations, but serves to modify the operation of the moving time window operator that follows it. The NOMISS operator has no effect unless it is followed by a moving time window operator.
For example, the following statement computes a five-period moving average of the variable
X
but produces a missing value when any of the five values are missing:convert x=y / transformout=( nomiss movave 5 );The following statement computes the cumulative sum of the variable
X
but produces a missing value for all periods after the first missingX
value:convert x=y / transformout=( nomiss cusum );
EDITed to add NOMISS option.
Just eliminate the first 4 records from the output data set. In that case, you would be much better off giving the output data set a different name than the input data set.
So what result do you want when there is not 5 data point (i.e. at least one of the previous 5 are missing)?
Add the TRIM and NOMISS options to your CONVERT statement.
See the explanation in Trick 4 in this paper or check the documentation.
https://support.sas.com/resources/papers/proceedings10/093-2010.pdf
NOMISS options, from documentation link above
The NOMISS operator does not perform any calculations, but serves to modify the operation of the moving time window operator that follows it. The NOMISS operator has no effect unless it is followed by a moving time window operator.
For example, the following statement computes a five-period moving average of the variable
X
but produces a missing value when any of the five values are missing:convert x=y / transformout=( nomiss movave 5 );The following statement computes the cumulative sum of the variable
X
but produces a missing value for all periods after the first missingX
value:convert x=y / transformout=( nomiss cusum );
EDITed to add NOMISS option.
For those who will see this post:
1. trimleft doesn' handle missing values well.
2. nomiss should be used for handling missing values.
SQL can solve it easily .
data air;
set sashelp.air;
t+1;
run;
proc sql;
create table want as
select *,case
when (select count(*) from air where t between a.t-4 and a.t)<5 then .
else (select std(air) from air where t between a.t-4 and a.t)
end as rolling_std
from air as a;
quit;
data prdsale;
set sashelp.PrdSale;
run;
proc sort data= prdsale nodupkey;
by region division year quarter;
run;
data air;
set prdsale;
by region division;
if first.division then t=0;
t+1;
run;
proc sql;
create table want as
select *,case
when (select count(*) from air
where region=a.region and division=a.division and t between a.t-4 and a.t)<5 then .
else (select std(actual) from air where
region=a.region and division=a.division and t between a.t-4 and a.t)
end as rolling_std
from air as a;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.