BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
joon1
Quartz | Level 8

Dear Madam/Sir,

 

I would like to have standard deviations (aqresstd) on five values (variable name: aqres) for the past four years (t-4,t-3,t-2,t-1) and the current year (t).

I ran the data using the suggested program from this community and there is no error message as follows:

 


data s9;
do n=1 by 1 until(last.gvkey);
set s8;
by gvkey fyear;
array t(50);
if n>=5 then aqresstd=std(of t(*));
t(n)=aqres;
if n(of t(*))>4 then do;k=n-4; call missing(t(k));end;
output;
end;
drop n t:;
run;

NOTE: There were 206263 observations read from the data set WORK.S8.
NOTE: The data set WORK.S9 has 206263 observations and 70 variables.
NOTE: DATA statement used (Total process time):
real time 0.54 seconds
cpu time 0.54 seconds

 

However, I do not have correct standard deviation values as follows;

                                                                                   gvkey        fyear

34567891011121314151617181920212223242526272829303132
001004198929.51.
0010041990-14.73.
001004199110.10.
00100419921.12.
00100419932.6418.456
0010041994-0.5110.439
0010041995-6.024.688
001004199626.903.774
001004199722.2314.544
00100419989.3816.334
001004199925.6514.752
00100420001.078.017
0010042001-67.3911.407
0010042002-24.6641.002
0010042003-3.8539.755
00100420040.2331.183
00100420050.6530.961
00100420063.3012.008
001004200756.962.953
001004200881.3527.817
001004200940.4640.058
0010042010-38.1132.775
001004201148.9051.657
0010042012-8.5550.672
001004201353.3141.223
0010042014-235.4244.676
001004201511.76136.268
0010042016-25.15129.714
0010042017-7.02128.429
0010042018-52.38115.299

 

It will be greatly appreciative if you can advise me how to fix the program. Thanks.

 

Sincerely,

Joon

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If you need a minimum of three observations (why would 3 obs ever be sufficient for a std estimate?), just prefix the aqresstd=std(of t{*}) assignment with "if n(of t{*})>=3 then ", as below.

 

But wait a minute.  Are you saying that you have some years missing?  So my question is

 

  1. Do you have completely missing records for some fiscal years?  (i.e. some FYEARS are skipped)
    or
  2. Do you have all the fiscal years, but for some of those years the aqres variable is missing?

 

In either case, use MOD(FYEAR,5) instead of MOD(_N_,5).  That change wouldn't hurt situation 2, and would be essential for situation 1.

 

But if you have situation 1, then "holes" in the fyear sequence must be set to missing in the corresponding elements of the T array, as in the do loop below:

 

data g1;
  set s8;
  by gvkey ;
  array t {5} _temporary_;
  if first.gvkey then call missing(of t{*});
  do fy=sum(lag(fyear),1) to fyear-1; 
    t{1+mod(fy,5)}=.;
  end;
  t{1+mod(fyear,5)}=aqres;
  if n(of t{*})>3 then aqresstd=std(of t{*});
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Please explain what is wrong.

--
Paige Miller
mkeintz
PROC Star

You want rolling 5-year STD of aqres for (current year back to current year-4), where your data (from Compustat) is a series of annual fiscal years within each company id (GVKEY).

 

This is a much simpler approach:

 

data s9;
  set s8;
  by gvkey fyear;
  array t {5} _temporary_;

  if first.gvkey then call missing(of t{*});
  t{1+mod(_n_,5)}=aqres;
  if n(of t{*})=5 then aqresstd=std(of t{*});
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
joon1
Quartz | Level 8

Thank you so much, mkeintz. Your code worked well.

Another question: how can I put condition that your program works if data for at least three of last five years (t-4,t) are available?
I tried to insert one line in your code, but it does not work as follows:
 

data g1;
set s8;
by gvkey fyear;
array t {5} _temporary_;
if first.gvkey then call missing(of t{*});
 t{1+mod(_n_,5)}=aqres;
 if n(of t{*})>3 then do;
 if n(of t{*})=5 then aqresstd=std(of t{*});
 run;



ERROR 117-185: There was 1 unclosed DO block.

NOTE: The SAS System stopped processing this step because of errors.

 

Any help will be highly appreciated.

Thanks

Joon

 
mkeintz
PROC Star

If you need a minimum of three observations (why would 3 obs ever be sufficient for a std estimate?), just prefix the aqresstd=std(of t{*}) assignment with "if n(of t{*})>=3 then ", as below.

 

But wait a minute.  Are you saying that you have some years missing?  So my question is

 

  1. Do you have completely missing records for some fiscal years?  (i.e. some FYEARS are skipped)
    or
  2. Do you have all the fiscal years, but for some of those years the aqres variable is missing?

 

In either case, use MOD(FYEAR,5) instead of MOD(_N_,5).  That change wouldn't hurt situation 2, and would be essential for situation 1.

 

But if you have situation 1, then "holes" in the fyear sequence must be set to missing in the corresponding elements of the T array, as in the do loop below:

 

data g1;
  set s8;
  by gvkey ;
  array t {5} _temporary_;
  if first.gvkey then call missing(of t{*});
  do fy=sum(lag(fyear),1) to fyear-1; 
    t{1+mod(fy,5)}=.;
  end;
  t{1+mod(fyear,5)}=aqres;
  if n(of t{*})>3 then aqresstd=std(of t{*});
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
joon1
Quartz | Level 8

Thank you so much, mkeintz. I greatly appreciate it.

Joon1

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1339 views
  • 0 likes
  • 3 in conversation