BookmarkSubscribeRSS Feed
Fer_
Fluorite | Level 6

Hello everyone, 

 

I have dataset which has gvkey, year, and sale. I need to calculate standard deviation pf sale for each firm (identified by gvkey) between year of t to t-4.

For example  standard deviation of sale for 1996 from 1996 to 1995, 1994, 1993, 1992. 

I appreciate your help. 

gvkeyyear sale 
10041992524
10041993765
10041994442
10041995510
10041996324
….….
101919902900
101919913600
101919925624
101919934527
101919946570
101919954298
101919964657
101919975498
101919986790
….
3 REPLIES 3
sbxkoenk
SAS Super FREQ

Hello,

 

You need a moving standard deviation with size of the window equal to 5 (t-4, t-3, t-2, t-1, t).

You can easily do this with PROC EXPAND (SAS/ETS).
Here are some transformations that will be of interest to you :

  • CMOVSTD window Centered moving standard deviation
  • CUSTD (n is optional) Cumulative standard deviation
  • MOVSTD window Backward moving weighted standard deviation:

Try for example this:
Take care! : I am not sure if this will calculate std on (t-4, t-3, t-2, t-1, t) or on (t-5, t-4, t-3, t-2, t-1).
Please check ... if it's not including the current line t (as you want) , I will tell you how to "shift" the window.

proc expand data=sashelp.pricedata out=abc;
 by region line product;
 convert sale = sale_MOVSTD / transformout=(MOVSTD 5);
run;

proc datasets library=work NoList Nodetails memtype=DATA;
 modify abc;
  label sale_MOVSTD = "sale moving std window size = 5";
run;
QUIT;

BR, Koen

mkeintz
PROC Star

@sbxkoenk 

 

You can actually assign the label during the PROC EXPAND, thereby avoiding the need for a subsequent PROC DATASETS.

 

The price?  An unavoidable warning note, as in:

31   proc expand data=sashelp.pricedata out=abc;
32    by region line product;
33    convert sale = sale_MOVSTD / transformout=(MOVSTD 5);
34    label sale_MOVSTD = "sale moving std window size = 5";
WARNING: Variable SALE_MOVSTD not found in data set SASHELP.PRICEDATA.
35   run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User
data want;
array p{0:4} _temporary_;
set have; 
by gvkey year;

if first.gvkey then do;
call missing(of p{*});
num_years=0;
end;
num_years+1;

p{mod(_n_,5)} = sale;

*if you only want to calculate this once you have 5 years of data;
if num_years>=5 then sale_std = std(of p(*));

*if not conditional remove the IF portion;
run;

 

I would probably recommend PROC EXPAND as it will deal with missing data and years better, however, here's a rough data step approach. You will need to tailor to deal with missing years/data if necessary.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 822 views
  • 3 likes
  • 4 in conversation