BookmarkSubscribeRSS Feed
Fer_
Fluorite | Level 6

Hello everyone, 

 

I have dataset which has gvkey, year, and sale. I need to calculate standard deviation pf sale for each firm (identified by gvkey) between year of t to t-4.

For example  standard deviation of sale for 1996 from 1996 to 1995, 1994, 1993, 1992. 

I appreciate your help. 

gvkeyyear sale 
10041992524
10041993765
10041994442
10041995510
10041996324
….….
101919902900
101919913600
101919925624
101919934527
101919946570
101919954298
101919964657
101919975498
101919986790
….
3 REPLIES 3
sbxkoenk
SAS Super FREQ

Hello,

 

You need a moving standard deviation with size of the window equal to 5 (t-4, t-3, t-2, t-1, t).

You can easily do this with PROC EXPAND (SAS/ETS).
Here are some transformations that will be of interest to you :

  • CMOVSTD window Centered moving standard deviation
  • CUSTD (n is optional) Cumulative standard deviation
  • MOVSTD window Backward moving weighted standard deviation:

Try for example this:
Take care! : I am not sure if this will calculate std on (t-4, t-3, t-2, t-1, t) or on (t-5, t-4, t-3, t-2, t-1).
Please check ... if it's not including the current line t (as you want) , I will tell you how to "shift" the window.

proc expand data=sashelp.pricedata out=abc;
 by region line product;
 convert sale = sale_MOVSTD / transformout=(MOVSTD 5);
run;

proc datasets library=work NoList Nodetails memtype=DATA;
 modify abc;
  label sale_MOVSTD = "sale moving std window size = 5";
run;
QUIT;

BR, Koen

mkeintz
Jade | Level 19

@sbxkoenk 

 

You can actually assign the label during the PROC EXPAND, thereby avoiding the need for a subsequent PROC DATASETS.

 

The price?  An unavoidable warning note, as in:

31   proc expand data=sashelp.pricedata out=abc;
32    by region line product;
33    convert sale = sale_MOVSTD / transformout=(MOVSTD 5);
34    label sale_MOVSTD = "sale moving std window size = 5";
WARNING: Variable SALE_MOVSTD not found in data set SASHELP.PRICEDATA.
35   run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User
data want;
array p{0:4} _temporary_;
set have; 
by gvkey year;

if first.gvkey then do;
call missing(of p{*});
num_years=0;
end;
num_years+1;

p{mod(_n_,5)} = sale;

*if you only want to calculate this once you have 5 years of data;
if num_years>=5 then sale_std = std(of p(*));

*if not conditional remove the IF portion;
run;

 

I would probably recommend PROC EXPAND as it will deal with missing data and years better, however, here's a rough data step approach. You will need to tailor to deal with missing years/data if necessary.

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 218 views
  • 3 likes
  • 4 in conversation