I have 30 years of profit data, and I want to calculate standard deviation for each year using past 5 years profits. The first year with standard deviation data should be the 6th year. ( I calculate the standard deviation for year 6 to year 30 ).
Please help.
Thanks.
Below code works under the assumption that you don't have missing years.
data want;
array arr_prft {5} 8. _temporary_;
set WORK.ROLLIMG_STANDARD_DEVIATION;
by gvkey year;
length std_prft 8;
retain std_calc_flag 0;
element+1;
if element>dim(arr_prft) then
do;
element=1;
std_calc_flag=1;
end;
arr_prft[element]=prft;
if std_calc_flag then
std_prft=std(of arr_prft
output;
if last.gvkey then
call missing(of _all_);
run;
Since you don't have sample data, I can only give out general suggestions.
1. Use Proc Expand. It has moving standard deviation built-in. it is the easiest way, but it requires license of SAS/ETS.
2. Use Lagn(), in your case, Lag5().
3. Use temporary Array(), presumably the most efficient one?
Upon seeing your sample data, all of these approaches could be finalized.
Haikuo
Hai.Guo
I attached the simplified data, One more thing, I need to do this by gvkey ( company id). I have thousands companies. Please help me with the code. I am not good at array function.
Thanks
Anna
Below code works under the assumption that you don't have missing years.
data want;
array arr_prft {5} 8. _temporary_;
set WORK.ROLLIMG_STANDARD_DEVIATION;
by gvkey year;
length std_prft 8;
retain std_calc_flag 0;
element+1;
if element>dim(arr_prft) then
do;
element=1;
std_calc_flag=1;
end;
arr_prft[element]=prft;
if std_calc_flag then
std_prft=std(of arr_prft
output;
if last.gvkey then
call missing(of _all_);
run;
Patrick,
Thanks. It is very likely I have missing years. If so, what kind of modifications should be?
Anna
As Hai.kuo pointed out SAS/ETS procedures like Proc Expand and Proc Timeseries would do the job for you.
For the code I've posted easiest would be to fix up the time series adding years with missing prices.
data All_Years(keep=gvkey year);
set WORK.ROLLIMG_STANDARD_DEVIATION(keep=gvkey year);
by gvkey year;
lag_year=lag(year);
source_year=year;
if first.gvkey then output;
else
do year=lag_year+1 to source_year;
output;
end;
run;
data want;
array arr_prft {5} 8. _temporary_;
merge WORK.ROLLIMG_STANDARD_DEVIATION All_Years;
by gvkey year;
length std_prft 8;
retain std_calc_flag 0;
element+1;
if element>dim(arr_prft) then
do;
element=1;
std_calc_flag=1;
end;
if std_calc_flag then
std_prft=std(of arr_prft
arr_prft[element]=prft;
output;
if last.gvkey then
call missing(of _all_);
run;
There was a logical error in my code. Please note that the calculation for the standard deviation must come BEFORE assigning the current prft to the array element.
I'm sure Ksharp's answer will solve your problem beautifully.
Patrick,
Thanks. I think it is good enough for me.
Anna
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.