Dear Madam/Sir,
I would like to construct standard deviation of ROA with a rolling window of the past five years.
After referring several comments in this forum, I made the following command and had the error messages (attached).
proc sql;
create table m24 as select *, (std(a.roa) from have where a.fyear between
a.fyear-6 and a.fyear-1 and a.gvkey) from m23 as a;
quit;
Obs | gvkey | FYEAR | roa |
1 | 1004 | 1999 | 0.04745 |
2 | 1004 | 2000 | 0.02640 |
3 | 1004 | 2001 | -0.08299 |
4 | 1004 | 2002 | -0.01807 |
5 | 1004 | 2003 | 0.00494 |
6 | 1004 | 2004 | 0.02110 |
7 | 1004 | 2005 | 0.03592 |
8 | 1004 | 2006 | 0.05494 |
9 | 1004 | 2007 | 0.05517 |
10 | 1004 | 2008 | 0.05710 |
11 | 1004 | 2009 | 0.02973 |
12 | 1004 | 2010 | 0.04098 |
13 | 1004 | 2011 | 0.03084 |
14 | 1004 | 2012 | 0.02574 |
15 | 1004 | 2013 | 0.03314 |
16 | 1004 | 2014 | 0.00673 |
17 | 1004 | 2015 | 0.03308 |
18 | 1004 | 2016 | 0.03756 |
19 | 1013 | 2000 | 0.21864 |
20 | 1013 | 2001 | -0.51514 |
21 | 1013 | 2002 | -1.00070 |
22 | 1013 | 2003 | -0.05914 |
23 | 1013 | 2004 | 0.01148 |
24 | 1013 | 2005 | 0.07212 |
25 | 1013 | 2006 | 0.04077 |
26 | 1013 | 2007 | 0.06023 |
27 | 1013 | 2008 | -0.02181 |
28 | 1013 | 2009 | -0.35301 |
29 | 1013 | 2010 | 0.04205 |
The error logs are attached.
Any of your help will be highly appreciated.
Sincerely,
Joon
@joon1 The following approach is simpler and faster
data have;
input Obs gvkey FYEAR roa;
drop obs;
cards;
1 1004 1999 0.04745
2 1004 2000 0.0264
3 1004 2001 -0.08299
4 1004 2002 -0.01807
5 1004 2003 0.00494
6 1004 2004 0.0211
7 1004 2005 0.03592
8 1004 2006 0.05494
9 1004 2007 0.05517
10 1004 2008 0.0571
11 1004 2009 0.02973
12 1004 2010 0.04098
13 1004 2011 0.03084
14 1004 2012 0.02574
15 1004 2013 0.03314
16 1004 2014 0.00673
17 1004 2015 0.03308
18 1004 2016 0.03756
19 1013 2000 0.21864
20 1013 2001 -0.51514
21 1013 2002 -1.0007
22 1013 2003 -0.05914
23 1013 2004 0.01148
24 1013 2005 0.07212
25 1013 2006 0.04077
26 1013 2007 0.06023
27 1013 2008 -0.02181
28 1013 2009 -0.35301
29 1013 2010 0.04205
;
data want;
do n=1 by 1 until(last.gvkey);
set have;
by gvkey;
array t(9999);
if n>=6 then std_dev=std(of t(*));
t(n)=roa;
if n(of t(*))>5 then do;k=n-5; call missing(t(k));end;
output;
end;
drop k n t:;
run;
Hi @joon1 Can you please post a sample of the expected Output to your sample. This may help somebody test their code against your code and make it reusable for the wider community.
Thanks for your response, novinosrin.
The expected output is attached.
Your help will be highly appreciated.
Joon1
Hi @joon1 See if this works
data have;
input Obs gvkey FYEAR roa;
drop obs;
cards;
1 1004 1999 0.04745
2 1004 2000 0.0264
3 1004 2001 -0.08299
4 1004 2002 -0.01807
5 1004 2003 0.00494
6 1004 2004 0.0211
7 1004 2005 0.03592
8 1004 2006 0.05494
9 1004 2007 0.05517
10 1004 2008 0.0571
11 1004 2009 0.02973
12 1004 2010 0.04098
13 1004 2011 0.03084
14 1004 2012 0.02574
15 1004 2013 0.03314
16 1004 2014 0.00673
17 1004 2015 0.03308
18 1004 2016 0.03756
19 1013 2000 0.21864
20 1013 2001 -0.51514
21 1013 2002 -1.0007
22 1013 2003 -0.05914
23 1013 2004 0.01148
24 1013 2005 0.07212
25 1013 2006 0.04077
26 1013 2007 0.06023
27 1013 2008 -0.02181
28 1013 2009 -0.35301
29 1013 2010 0.04205
;
proc sql;
create table want as
select gvkey,Fyear,roa,ifn(n=6,std(t),.) as standard_dev
from
(select a.*,ifn(max(b.fyear)=b.fyear,.,b.roa) as t,n(a.fyear) as n
from have a inner join have b
on a.gvkey=b.gvkey and a.fyear-5<= b.FYEAR<= a.fyear
group by a.gvkey,a.fyear,a.roa)
group by gvkey,fyear,roa
order by gvkey,fyear;
quit;
Thank you so much, novinosrin. I have the following error log.
115 proc sql;
116 create table want as
117 select gvkey,Fyear,roa,ifn(n=6,std(t),.) as roastd
118 from
119 (select a.*,ifn(max(b.fyear)=b.fyear,.,b.roa) as t,n(a.fyear) as n /*std(calculated t) as
119! roastd*/
120 from have a inner join have b
121 on a.gvkey=b.gvkey and a.fyear-5<= b.FYEAR<= a.fyear
122 group by a.gvkey,a.fyear,a.roa)
123 group by gvkey,fyear,roa
124 order by gvkey,fyear;
ERROR: File WORK.HAVE.DATA does not exist.
ERROR: File WORK.HAVE.DATA does not exist.
It will be highly appreciative if you can guide me to fix this error.
Thanks
Joon
HAVE is the sample I created using your sample. Please review my post carefully
@joon1 The following approach is simpler and faster
data have;
input Obs gvkey FYEAR roa;
drop obs;
cards;
1 1004 1999 0.04745
2 1004 2000 0.0264
3 1004 2001 -0.08299
4 1004 2002 -0.01807
5 1004 2003 0.00494
6 1004 2004 0.0211
7 1004 2005 0.03592
8 1004 2006 0.05494
9 1004 2007 0.05517
10 1004 2008 0.0571
11 1004 2009 0.02973
12 1004 2010 0.04098
13 1004 2011 0.03084
14 1004 2012 0.02574
15 1004 2013 0.03314
16 1004 2014 0.00673
17 1004 2015 0.03308
18 1004 2016 0.03756
19 1013 2000 0.21864
20 1013 2001 -0.51514
21 1013 2002 -1.0007
22 1013 2003 -0.05914
23 1013 2004 0.01148
24 1013 2005 0.07212
25 1013 2006 0.04077
26 1013 2007 0.06023
27 1013 2008 -0.02181
28 1013 2009 -0.35301
29 1013 2010 0.04205
;
data want;
do n=1 by 1 until(last.gvkey);
set have;
by gvkey;
array t(9999);
if n>=6 then std_dev=std(of t(*));
t(n)=roa;
if n(of t(*))>5 then do;k=n-5; call missing(t(k));end;
output;
end;
drop k n t:;
run;
Thank you so much, novinosrin. I greatly appreciate your help.
Have a great day
Joon1
data have;
infile cards expandtabs truncover;
input Obs gvkey FYEAR roa;
drop obs;
cards;
1 1004 1999 0.04745
2 1004 2000 0.0264
3 1004 2001 -0.08299
4 1004 2002 -0.01807
5 1004 2003 0.00494
6 1004 2004 0.0211
7 1004 2005 0.03592
8 1004 2006 0.05494
9 1004 2007 0.05517
10 1004 2008 0.0571
11 1004 2009 0.02973
12 1004 2010 0.04098
13 1004 2011 0.03084
14 1004 2012 0.02574
15 1004 2013 0.03314
16 1004 2014 0.00673
17 1004 2015 0.03308
18 1004 2016 0.03756
19 1013 2000 0.21864
20 1013 2001 -0.51514
21 1013 2002 -1.0007
22 1013 2003 -0.05914
23 1013 2004 0.01148
24 1013 2005 0.07212
25 1013 2006 0.04077
26 1013 2007 0.06023
27 1013 2008 -0.02181
28 1013 2009 -0.35301
29 1013 2010 0.04205
;
proc sql;
create table m24 as
select *, (select std(roa) from have where fyear between
a.fyear-6 and a.fyear-1 and gvkey=a.gvkey)
from have as a;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.