Dunne
Obsidian | Level 7

Hello experts,

 

I tried to repeat the code below 200 times, but when I add the DO loop, SAS reports "Array subscript out of range". Could you please show me how to achieve what I want? When I change the seed, it sometimes works and sometimes does not.

Many thanks!

 

 

%let nobs=100;
%let nboot=200;

data weight_1;
  do i=1 to &nboot.;
    output;
  end;

data weight_2;
  set weight_1 end=lastobs;
  call streaminit (1293);
  do n=1 to &nboot.;
    array R [&Nobs] _temporary_;
    if _N_=1 then do;
      do I=1 to &NObs.;
        R[I]=rand('gamma',1,1);
      end;

      diff=&Nobs. - sum(of r[*]);
      do i=1 to diff;
        R[rand('gamma', &Nobs.,1)] + (diff>0);
      end;
      VET+sum(of R[*]);
    end;
    ran=r[_N_];
  end;
run;


13 REPLIES
PaigeMiller
Diamond | Level 26

Whenever you get errors in the log, we need to see the ENTIRE log for this DATA step. Please copy it as text and paste it into the window that appears when you click on the </> icon.


--
Paige Miller
Dunne
Obsidian | Level 7
1                                                          The SAS System                                  10:43 Sunday, May 1, 2022

1          ;*';*";*/;quit;run;
2          OPTIONS PAGENO=MIN;
3          %LET _CLIENTTASKLABEL='Program';
4          %LET _CLIENTPROCESSFLOWNAME='Standalone Not In Project';
5          %LET _CLIENTPROJECTPATH='';
6          %LET _CLIENTPROJECTPATHHOST='';
7          %LET _CLIENTPROJECTNAME='';
8          %LET _SASPROGRAMFILE='';
9          %LET _SASPROGRAMFILEHOST='';
10         
11         ODS _ALL_ CLOSE;
12         OPTIONS DEV=SVG;
13         GOPTIONS XPIXELS=0 YPIXELS=0;
14         %macro HTML5AccessibleGraphSupported;
15             %if %_SAS_VERCOMP(9, 4, 4) >= 0 %then ACCESSIBLE_GRAPH;
16         %mend;
17         FILENAME EGHTML TEMP;
18         ODS HTML5(ID=EGHTML) FILE=EGHTML
19             OPTIONS(BITMAP_MODE='INLINE')
20             %HTML5AccessibleGraphSupported
21             ENCODING='utf-8'
22             STYLE=HtmlBlue
23             NOGTITLE
24             NOGFOOTNOTE
25             GPATH=&sasworklocation
26         ;
NOTE: Writing HTML5(EGHTML) Body file: EGHTML
27         
28         %let nobs=100;
29         
30         %let nboot=200;
31         
32         
33         
34         data weight_1;
35         
36         do i=1 to &nboot.;
37         
38         output;
39         
40         end;
41         
42         
43         

NOTE: The data set WORK.WEIGHT_1 has 200 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      

44         data weight_2;
45         
46         set weight_1 end=lastobs;
47         call streaminit (1293);
48         do n=1 to &nboot.;
49         
50         array R [&Nobs] _temporary_;
51         
52         if _N_=1 then do;
53         
54         do I=1 to &NObs.;
55         
56         R[I}=rand('gamma',1,1);
57         
58         end;
59         
60         
61         
62         diff=&Nobs. - sum(of r[*]);
63         
64         do i=1 to diff;
65         
66         R[rand('gamma', &Nobs.,1)] + (diff>0);
67         
68         end;
69         
70         VET+sum(of R[*]);
71         
72         end;
73         
74         ran=r[_N_];
75         
76         end;
77         
78         run;

ERROR: Array subscript out of range at line 66 column 1.
lastobs=0 i=1 n=2 diff=4.7355896866 VET=105.65851235 ran=0.3109976073 _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.WEIGHT_1.
WARNING: The data set WORK.WEIGHT_2 may be incomplete.  When this step was stopped there were 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.10 seconds
      cpu time            0.03 seconds
      

79         
80         %LET _CLIENTTASKLABEL=;
81         %LET _CLIENTPROCESSFLOWNAME=;
82         %LET _CLIENTPROJECTPATH=;
83         %LET _CLIENTPROJECTPATHHOST=;
84         %LET _CLIENTPROJECTNAME=;
85         %LET _SASPROGRAMFILE=;
86         %LET _SASPROGRAMFILEHOST=;
87         
88         ;*';*";*/;quit;run;
89         ODS _ALL_ CLOSE;
90         
91         
92         QUIT; RUN;
93         

Thanks for your suggestion. The log is attached. You can also just run the code to see it because it is simulated data.

PaigeMiller
Diamond | Level 26

Line 66

 

66         R[rand('gamma', &Nobs.,1)] + (diff>0);

You are creating a value from a gamma distribution and using it as the index of array R. Values from a gamma distribution are continuous, so they are not necessarily integers, but the index to array R must be an integer between 1 and the size of the array. How to fix this? I have no idea, as I don't know what this line is trying to do. Please explain not only this line, but what your program is trying to do.

Dunne
Obsidian | Level 7

Thank you so much for your quick response.

 

I am trying to do fractional bootstrapping (https://arxiv.org/pdf/1808.08199.pdf ) for rare outcomes. The number of times each observation is resampled (the number of hits) is not an integer but a continuous value, which we call a weight. The sum of these weights has to equal the sample size, with variance = mean.

 

For example, I want expected weights like these:

N   ID   expected weight
1   1    0.2
1   2    2.2
1   3    0.6
2   1    0.5
2   2    0.8
2   3    1.7
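
A minimal sketch of one way to produce a table of this shape (N, ID, weight) for all &nboot. samples in a single DATA step. It rests on an assumption that is not confirmed anywhere in this thread: that the fractional weights can be obtained by drawing one gamma(1,1) value per observation and rescaling the draws so that they sum to &nobs. within each sample. The data set names frw_weights and check and the variables total and weight are made up for illustration.

```sas
%let nobs  = 100;   /* observations per bootstrap sample */
%let nboot = 200;   /* number of bootstrap samples       */

data frw_weights;
   call streaminit(1293);                  /* seed the stream once for the whole step */
   array R[&nobs.] _temporary_;
   do n = 1 to &nboot.;                    /* one pass per bootstrap sample */
      total = 0;
      do id = 1 to &nobs.;
         R[id] = rand('gamma', 1, 1);      /* positive, continuous draw */
         total = total + R[id];
      end;
      do id = 1 to &nobs.;
         weight = R[id] * &nobs. / total;  /* assumption: rescale so weights sum to &nobs. */
         output;
      end;
   end;
   keep n id weight;
run;

/* Check each sample: sum_weight should be &nobs. and mean_weight should be 1 */
proc means data=frw_weights noprint nway;
   class n;
   var weight;
   output out=check sum=sum_weight mean=mean_weight var=var_weight;
run;
```

Because the loop over n sits inside one DATA step that reads no input data set, the array is refilled for every sample and no array subscript ever depends on a random draw. A single CALL STREAMINIT is enough here: the random stream simply continues from one sample to the next, so each sample gets different weights.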

 

 


 

PaigeMiller
Diamond | Level 26

What is line 66 supposed to be doing?

mkeintz
PROC Star

The most important response has already been provided to you by @PaigeMiller .

 

To get the best quality help, provide the best quality problem description - in this case a log of your program, which probably tells you exactly which line in your program has the array index out-of-range condition.

 

However, I will take a guess.  You have defined the array R with 100 elements, with 1 as the lower bound of the array index and 100 as the upper bound.  You have references to

R[I]

but I seems to always be an integer from 1 to 100 - i.e. within bounds.

 

So what about the reference  (with NOBS=100)

        R[rand('gamma', &Nobs.,1)] + (diff>0);

I presume the rand('gamma',&NOBs.,1) returns a continuous result in the (0,100] interval - i.e. greater than zero and up through 100.

 

This means it may occasionally return a value less than 1.0.   Let's say it returns 0.8.   SAS will interpret the array reference R[0.8] as R[0], which doesn't exist.   Some seeds probably generate a RAND result less than 1.0 earlier than others, producing the error message you refer to.

 

So be aware that SAS rounds down non-integer values to integers when indexing arrays.  Is that the behavior you want?  For instance, I'm sure you'll never populate the R[100] element, since you can't generate a number greater than 100.  Its probability is virtually zero.

 

 

Edited additional note:   You probably could change

 R[rand('gamma', &Nobs.,1)] + (diff>0);

to

 R[1+rand('gamma', &Nobs.,1)] + (diff>0);

 

which would effectively round UP non-integer RAND results.  This would likely eliminate the occasional out-of-range notes.  But I have no idea whether it will do the task that you intend, or whether it is consistent with your other references to the array.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Dunne
Obsidian | Level 7

Thanks for your detailed explanation. I want value  0<"value"<100

mkeintz
PROC Star

@Dunne wrote:

Thanks for your detailed explanation. I want value  0<"value"<100


And isn't that exactly what you are already generating in your RAND('gamma',100,1) expression?

 

It appears that you want to use that function to randomly draw from 100 "buckets" indexed by the array.  Right now you have no bucket for RAND results less than 1.  You do have 100 buckets for RAND results in 1<=RAND<=100, but the last bucket (number 100) is mapped only to a RAND result of exactly 100, which I suspect has a "probability mass" of zero.  Its neighbor, bucket number 99, is mapped to RAND results in 99<=RAND<100, which has a probability >0.

 

In effect the array reference is using the FLOOR function to convert a non-integer result to the integer below it to identify an array element.  Instead you could apply the CEIL (for ceiling) function against the RAND function, as in

 

 R[CEIL(rand('gamma', &Nobs.,1))] + (diff>0);

 

This would map the RAND function to the array index as follows:

 

RAND Results     Element number for R[RAND....]
0<RAND<=1        1
1<RAND<=2        2
...              ...
98<RAND<=99      99
99<RAND<=100     100

 

So you would now be randomly drawing from 100 ordered buckets of equal "size" using a gamma distribution.
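
A minimal sketch of the CEIL mapping in the table above, using a few hand-picked values instead of random draws. The values and the variable names x and bucket are chosen only for illustration.

```sas
data _null_;
   /* Hand-picked values standing in for RAND results */
   do x = 0.2, 1, 1.5, 99.3, 100;
      bucket = ceil(x);       /* smallest integer greater than or equal to x */
      put x= bucket=;         /* write the mapping to the log */
   end;
run;
```

Both 0.2 and 1 land in bucket 1, and 99.3 lands in bucket 100, which matches the intervals in the table. Note that CEIL by itself does nothing about values above 100, which would still map to a bucket that does not exist.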

Dunne
Obsidian | Level 7

Thank you so much!
It works now if I generate only one sample (by deleting lines 22, 42, and 43 below). However, it fails when I repeat the process &nboot. times (I want to generate &nboot. samples of the entire process). Could you please help me with this as well?

 

[Image: screenshot of the SAS log]

 

mkeintz
PROC Star

Instead of pasting an image of the log into your note, could you please capture the text and insert it as fixed width content (use the "</>" icon to open a text box for pasting).    I find it difficult to read the image.

Dunne
Obsidian | Level 7
1                                                          The SAS System                                  18:27 Sunday, May 1, 2022

1          ;*';*";*/;quit;run;
2          OPTIONS PAGENO=MIN;
3          %LET _CLIENTTASKLABEL='Program';
4          %LET _CLIENTPROCESSFLOWNAME='Standalone Not In Project';
5          %LET _CLIENTPROJECTPATH='';
6          %LET _CLIENTPROJECTPATHHOST='';
7          %LET _CLIENTPROJECTNAME='';
8          %LET _SASPROGRAMFILE='';
9          %LET _SASPROGRAMFILEHOST='';
10         
11         ODS _ALL_ CLOSE;
12         OPTIONS DEV=SVG;
13         GOPTIONS XPIXELS=0 YPIXELS=0;
14         %macro HTML5AccessibleGraphSupported;
15             %if %_SAS_VERCOMP(9, 4, 4) >= 0 %then ACCESSIBLE_GRAPH;
16         %mend;
17         FILENAME EGHTML TEMP;
18         ODS HTML5(ID=EGHTML) FILE=EGHTML
19             OPTIONS(BITMAP_MODE='INLINE')
20             %HTML5AccessibleGraphSupported
21             ENCODING='utf-8'
22             STYLE=HtmlBlue
23             NOGTITLE
24             NOGFOOTNOTE
25             GPATH=&sasworklocation
26         ;
NOTE: Writing HTML5(EGHTML) Body file: EGHTML
27         
28         %let nobs=100;
29         %let nboot=200;
30         
31         data weight_1;
32         do i=1 to &nobs.;
33         output;
34         end;
35         run;

NOTE: The data set WORK.WEIGHT_1 has 100 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

36         
37         
38         data weight_2;
39         set weight_1 end=lastobs;
40         call streaminit (1293);
41         do n=1 to &nboot.;
42         
43         array R [&nobs.] _temporary_;
44         
45         if _N_=1 then do;
46         
47         do I=1 to (&nobs.);
48         
49         R(I)=rand('gamma',1,1);
50         
51         end;
52         
53         
54         diff=&nobs. - sum(of R[*]);
55         
56         do I=1 to diff;
57         
58         R[ceil(rand('gamma', &nobs.,1))] + (diff>0);
59         
60         end;
61         
62         VET+sum(of R[*]);
63         
64         end;
65         
66         ran=r[_N_];
67         output;
68         end;
69         
70         run;

ERROR: Array subscript out of range at line 58 column 1.
lastobs=0 i=1 n=2 diff=4.7355896866 VET=105.65851235 ran=0.3109976073 _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.WEIGHT_1.
WARNING: The data set WORK.WEIGHT_2 may be incomplete.  When this step was stopped there were 1 observations and 5 variables.
WARNING: Data set WORK.WEIGHT_2 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      

71         
72         %LET _CLIENTTASKLABEL=;
73         %LET _CLIENTPROCESSFLOWNAME=;
74         %LET _CLIENTPROJECTPATH=;
75         %LET _CLIENTPROJECTPATHHOST=;
76         %LET _CLIENTPROJECTNAME=;
77         %LET _SASPROGRAMFILE=;
78         %LET _SASPROGRAMFILEHOST=;
79         
80         ;*';*";*/;quit;run;
81         ODS _ALL_ CLOSE;
82         
83         
84         QUIT; RUN;
85         

If I remove lines 41, 67, and 68, the code works. But I want to replicate the code &nboot. times. Should I delete "call streaminit(1293)" to have different seeds for each sample?

 

Sorry for my slow response; I wanted to check my code carefully.

mkeintz
PROC Star (Accepted Solution)

First, thank you for posting the code in a text box - MUCH easier on my eyes - and it preserves the fixed pitch font in the SAS log - which is occasionally important.

 

I have checked on my previous comment:

 

I presume the rand('gamma',&NOBs.,1) returns a continuous result in the (0,100] interval - i.e. greater than zero and up through 100.

 

I should have done this before, because the SAS Functions and Call Routines documentation for the GAMMA distribution just says that the RAND('GAMMA',...) function will produce a positive number.  There is no upper limit, so you may very well be generating numbers greater than 100, which could cause the error message you report.
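
A quick way to see this for yourself is to draw a large number of values and count how many fall outside the 1 to &nobs. range. This is only a sketch: the data set name gamma_check, the flag variables below_1 and above_max, and the choice of 10,000 draws are arbitrary.

```sas
%let nobs = 100;

data gamma_check;
   call streaminit(1293);
   do draw = 1 to 10000;
      x = rand('gamma', &nobs., 1);   /* shape=&nobs., scale=1 */
      below_1   = (x < 1);            /* 1 if the draw would map below R[1]       */
      above_max = (x > &nobs.);       /* 1 if the draw would map above R[&nobs.]  */
      output;
   end;
run;

/* SUM of each flag is the number of out-of-range draws */
proc means data=gamma_check n min mean max sum;
   var x below_1 above_max;
run;
```

A gamma distribution with shape 100 and scale 1 has mean 100 and standard deviation 10, so roughly half of the draws should be flagged in above_max, while draws below 1 should essentially never occur.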

 

So if you have a strategy of a fixed number (100 in your case) of buckets selected based on the RAND('GAMMA',...) function, you will have to decide whether 100 buckets (i.e. macrovar NOBS=100) is enough, and what to do about results from RAND('GAMMA',...) that exceed &NOBS.   You could replace the single statement

        R[rand('gamma', &Nobs.,1)] + (diff>0);

with 

 

        J=ceil(rand('gamma', &Nobs.,1));

and then decide whether to add

        R[J] + (diff>0);

depending on the value of J.
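
A minimal sketch of that guard, assuming you simply want to skip any draw that falls outside the array bounds. Skipping is only one possible policy (you could also redraw or cap the value), and the counters skipped and used, as well as the 1,000 draws, are made up for illustration.

```sas
%let nobs = 100;

data _null_;
   call streaminit(1293);
   array R[&nobs.] _temporary_;
   do i = 1 to &nobs.;
      R[i] = 0;                             /* start with empty buckets */
   end;
   skipped = 0;
   do draw = 1 to 1000;
      J = ceil(rand('gamma', &nobs., 1));   /* candidate bucket number */
      if 1 <= J <= &nobs. then R[J] = R[J] + 1;
      else skipped = skipped + 1;           /* assumption: ignore out-of-range draws */
   end;
   used = sum(of R[*]);
   put used= skipped=;                      /* used + skipped equals the number of draws */
run;
```

Because every subscript is checked before it is used, this step cannot raise "Array subscript out of range", whatever the seed.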

 

Of course, it's not at all clear what your goal is, so you may need to take another strategy.

 

Dunne
Obsidian | Level 7

If I want to generate continuous numbers > 0 rather than integers, how should I modify the statement below? I am not familiar with simulation and tried to reuse someone's code that I found online, so I don't fully understand all of it. Many thanks!

R[I]

 
