BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mkeintz
PROC Star

LB and UB are intended to be in number form. Since ARRAY statements do not accept date literals as array bounds, you have to convert SAS dates into corresponding integer values.

 

But it turns out the error is a missing semi-colon prior to the ARRAY statement - 95% of my error are missing semi-colons.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ilikesas
Barite | Level 11

hi mkeintz,

 

I ran the code and got the following error message:

 

243  %let LB=%sysfunc(intnx(year,%sysfunc(inputn(&begdate,date9.)),-4,sameday));
244  %let UB=%sysfunc(inputn(&enddate,date9.));
245  data want (drop=_:);
246    set have (keep=id date x);
247    by id;
248
249    retain _rolling_ss   /* Rolling sum of squares*/
250           _rolling_sum  /* Rolling sum */
251           _rolling_n    /* Rolling count */         ;
252
253    array history {&lb:&ub} _temporary_;
254
255    if first.id then call missing(of history{*});
256
257    history{date}=x;
258
259    if x^=. then do;
260      _rolling_n+ 1;
261      _rolling_sum+ x;
262      _rolling_ss+ x**2;
263    end;
264
265    _most_recent_drop_date=intnx('year',date,-4,'sameday');
266    _prior_most_recent_dd=lag(_most_recent_drop_date);
267
268    if first.id=0 then do _drop_date=_prior_most_recent_dd+1 to _most_recent_drop_date;
269      if history{_drop_date} ^= . then do;
270        _rolling_n+ (-1);
271        _rolling_sum+ (-history{_drop_date});
272        _rolling_ss+ (-history{_drop_date}**2);
273      end;
274    end;
275
276    _rolling_mean=_rolling_sum/_rolling_n;
277    rolling_std=sqrt(  ( _rolling_ss - _rolling_n*(_rolling_mean**2))/(_rolling_n-1)
277!  );
278  run;

NOTE: Division by zero detected at line 277 column 68.
DATE=02JAN1997 id=85208J10 x=4.625 FIRST.id=1 LAST.id=0 _rolling_ss=21.390625 _rolling_sum=4.625
_rolling_n=1 _most_recent_drop_date=12055 _prior_most_recent_dd=. _drop_date=. _rolling_mean=4.625
rolling_std=. _ERROR_=1 _N_=1
ERROR: Array subscript out of range at line 269 column 8.
DATE=03JAN1997 id=85208J10 x=4.59375 FIRST.id=0 LAST.id=0 _rolling_ss=42.493164063
_rolling_sum=9.21875 _rolling_n=2 _most_recent_drop_date=12056 _prior_most_recent_dd=12055
_drop_date=12056 _rolling_mean=. rolling_std=. _ERROR_=1 _N_=2
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 277:15
NOTE: Mathematical operations could not be performed at the following places. The results of the
      operations have been set to missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 277:68
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 3 observations read from the data set WORK.HAVE.
WARNING: The data set WORK.WANT may be incomplete.  When this step was stopped there were 1
         observations and 4 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.03 seconds

Now there is only one error message: the array subscript is out of range, and in some places it says that there was a division by zero, so I don't know to what extent it has an influence.

 

At the end got the table want having only 1 observation, with stdev being missing.

 

Thnaks! 

mkeintz
PROC Star

I said the code was untested, and you're revealing why.

 

Everything you need to know is in the log.

 

Division by zero.  This program generates rolling STD for up-to-four-years of daily data, and does not set a minimum window size.  As a result it attempts to compute an STD even when you are starting a new id, i.e. when the window has only one observation (_rolling_n=1).  Since line 277 has (_rolling_n-1) in the denominator you'r dividing be zero.  You might want to generate an IF test to require a minimum _rolling_n value in line 277.

 

The array subscript out of bounds occurs at _N_=2.  This means _drop_date is either below the lower bound or above the upper bound of your array.  So look at the value of drop_date, figure out what it is as a sas date value, and determine its relation to macrovar LB and UB.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ilikesas
Barite | Level 11

I did %let begdate=02jan1997;  since it is the earliest date in the sample. I ontained the stdevs for all the observations and exported to Excel to to manual verifications and observed the following: For the first id I obtain the same stdevs as the ones imported from SAS, but when the second id begins, it's stdev calculation takes into account the x's from the previous id - but why does this happen if in the code did  if first.id  then call missing(of history{*}); ?

mkeintz
PROC Star

Make sure, that at first.id, all the relevant elements are reset to missing, not just the history array.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ilikesas
Barite | Level 11
Ran the code and it seems to be working (note that again I used begdate = 02JAN1997, even though my range of study starts at 02JAN2001). To be sure of the results, I would like to ask you the following: I exported the results into Excel in order to compare the SAS stdevs with the Excel STDEV.S . For the dates 02JAN1997 to 29DEC2000, where the rolling stdev of each date is based on all the dates before it, SAS and Excel have EXACTLY the same stdevs! For dates 02JAN2001 and onward, where the stdev at each date is based on the 4 prior years of data, SAS and Excel give similar results, but here there is some difference, with 1 date having a difference of 9%. Do you have any idea why this is happening? Is there something else that should be added to the code?
mkeintz
PROC Star

You can safely use BEGDATE = 02JAN2001 (i.e. the earliest actual data record), because the lower bound macrovar LB is set to 4 years prior, so any 4-year lookback will not be out of range.

 

As to your question, you will have to examine the cases with a difference, and see whether the  start of the date range for the non-matching STD's are the same.  If not, develop a rule to match them.  I presume the end dates will match.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 2031 views
  • 13 likes
  • 3 in conversation