Dear Madam/Sir,
I would like to compute the standard deviation (aqresstd) of five values (variable name: aqres): the past four years (t-4, t-3, t-2, t-1) and the current year (t).
I ran the data through the program below, which was suggested in this community, and it produced no error messages:
data s9;
do n=1 by 1 until(last.gvkey);
set s8;
by gvkey fyear;
array t(50);
if n>=5 then aqresstd=std(of t(*));
t(n)=aqres;
if n(of t(*))>4 then do;k=n-4; call missing(t(k));end;
output;
end;
drop n t:;
run;
NOTE: There were 206263 observations read from the data set WORK.S8.
NOTE: The data set WORK.S9 has 206263 observations and 70 variables.
NOTE: DATA statement used (Total process time):
real time 0.54 seconds
cpu time 0.54 seconds
However, the resulting standard deviation values are not correct, as shown below:
| gvkey | fyear | aqres | aqresstd |
| 001004 | 1989 | 29.51 | . |
| 001004 | 1990 | -14.73 | . |
| 001004 | 1991 | 10.10 | . |
| 001004 | 1992 | 1.12 | . |
| 001004 | 1993 | 2.64 | 18.456 |
| 001004 | 1994 | -0.51 | 10.439 |
| 001004 | 1995 | -6.02 | 4.688 |
| 001004 | 1996 | 26.90 | 3.774 |
| 001004 | 1997 | 22.23 | 14.544 |
| 001004 | 1998 | 9.38 | 16.334 |
| 001004 | 1999 | 25.65 | 14.752 |
| 001004 | 2000 | 1.07 | 8.017 |
| 001004 | 2001 | -67.39 | 11.407 |
| 001004 | 2002 | -24.66 | 41.002 |
| 001004 | 2003 | -3.85 | 39.755 |
| 001004 | 2004 | 0.23 | 31.183 |
| 001004 | 2005 | 0.65 | 30.961 |
| 001004 | 2006 | 3.30 | 12.008 |
| 001004 | 2007 | 56.96 | 2.953 |
| 001004 | 2008 | 81.35 | 27.817 |
| 001004 | 2009 | 40.46 | 40.058 |
| 001004 | 2010 | -38.11 | 32.775 |
| 001004 | 2011 | 48.90 | 51.657 |
| 001004 | 2012 | -8.55 | 50.672 |
| 001004 | 2013 | 53.31 | 41.223 |
| 001004 | 2014 | -235.42 | 44.676 |
| 001004 | 2015 | 11.76 | 136.268 |
| 001004 | 2016 | -25.15 | 129.714 |
| 001004 | 2017 | -7.02 | 128.429 |
| 001004 | 2018 | -52.38 | 115.299 |
I would greatly appreciate it if you could advise me how to fix the program. Thanks.
Sincerely,
Joon
Please explain what is wrong.
You want a rolling 5-year STD of aqres (current year back to current year-4), where your data (from Compustat) is a series of annual fiscal years within each company id (GVKEY).
This is a much simpler approach:
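data s9;
  set s8;
  by gvkey fyear;
  array t {5} _temporary_;   /* temporary arrays are retained across observations */
  if first.gvkey then call missing(of t{*});   /* start each company with an empty window */
  t{1+mod(_n_,5)}=aqres;   /* overwrite the slot filled five observations ago */
  if n(of t{*})=5 then aqresstd=std(of t{*});   /* only when all five slots are nonmissing */
run;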
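As for what went wrong in the original step, a quick check on the posted numbers suggests the cause: the STD is assigned before t(n)=aqres stores the current year, so at n=5 the array holds only the four prior years. A minimal sketch, using the first five aqres values from the posted output (the variable names prior4 and all5 are made up for illustration):

```sas
data _null_;
  /* aqres for fiscal years 1989-1992, copied from the posted output */
  prior4 = std(29.51, -14.73, 10.10, 1.12);
  /* the same four values plus fiscal year 1993 */
  all5 = std(29.51, -14.73, 10.10, 1.12, 2.64);
  /* write both results to the log for comparison */
  put prior4= 8.3 all5= 8.3;
run;
```

The four-value result should come out near 18.46, which is close to the 18.456 reported for 1993 (the posted inputs are rounded to two decimals), while the five-value result is near 16.08. That pattern is what you would expect if the current year is being left out of each window.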
Thank you so much, mkeintz. Your code worked well.
data g1;
set s8;
by gvkey fyear;
array t {5} _temporary_;
if first.gvkey then call missing(of t{*});
t{1+mod(_n_,5)}=aqres;
if n(of t{*})>3 then do;
if n(of t{*})=5 then aqresstd=std(of t{*});
run;
ERROR 117-185: There was 1 unclosed DO block.
NOTE: The SAS System stopped processing this step because of errors.
Any help will be highly appreciated.
Thanks
Joon
If you need a minimum of three observations (why would 3 obs ever be sufficient for a std estimate?), just prefix the aqresstd=std(of t{*}) assignment with "if n(of t{*})>=3 then ", as below.
But wait a minute. Are you saying that you have some years missing? So my question is
In either case, use MOD(FYEAR,5) instead of MOD(_N_,5). That change wouldn't hurt situation 2, and would be essential for situation 1.
But if you have situation 1, then "holes" in the fyear sequence must be set to missing in the corresponding elements of the T array, as in the do loop below:
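data g1;
  set s8;
  by gvkey ;
  array t {5} _temporary_;
  if first.gvkey then call missing(of t{*});
  /* blank out the slots of any fiscal years skipped since the previous observation */
  do fy=sum(lag(fyear),1) to fyear-1;
    t{1+mod(fy,5)}=.;
  end;
  t{1+mod(fyear,5)}=aqres;
  /* require at least three nonmissing values in the five-year window */
  if n(of t{*})>=3 then aqresstd=std(of t{*});
run;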
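To see how the MOD(FYEAR,5) slots behave when a year is absent, here is a small sketch on made-up data. The data set names toy and toy_out, the aqres values, and the helper variable nvals are all hypothetical; fiscal year 2003 is deliberately left out.

```sas
/* Hypothetical toy data: one company, fiscal year 2003 absent */
data toy;
  input gvkey $ fyear aqres;
  datalines;
001004 2000 1
001004 2001 2
001004 2002 3
001004 2004 5
001004 2005 6
001004 2006 7
;
run;

data toy_out;
  set toy;
  by gvkey;
  array t {5} _temporary_;
  if first.gvkey then call missing(of t{*});
  /* clear the slot of every year skipped since the previous observation */
  do fy = sum(lag(fyear),1) to fyear-1;
    t{1+mod(fy,5)} = .;
  end;
  t{1+mod(fyear,5)} = aqres;
  /* nvals: count of nonmissing values currently in the five-year window */
  nvals = n(of t{*});
  if nvals >= 3 then aqresstd = std(of t{*});
  drop fy;
run;

proc print data=toy_out noobs;
run;
```

In this toy run nvals should read 1, 2, 3, 4, 4, 4: the slot belonging to 2003 stays missing until it falls out of the window, so no window from 2004 onward ever reaches five values, and aqresstd is first filled in 2002, when three values are available.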
Thank you so much, mkeintz. I greatly appreciate it.
Joon1