turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

Highlighted
# Problem of Invalid subscript

Options

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-18-2017 02:21 PM

** the error shows as below:**

ERROR: (execution) Invalid subscript or subscript out of range.

operation : [ at line 186 column 6

operands : T, t, *LIT1006

**but actually, T is a 120×1 matrix, even I adjust "do t=1 to 12o" to "do t=1 to 2", I do not understand why.**

**and the code:**

```
proc iml;
load N;
load T;
load NT;
B_0=N*(T`);
use ba.t3;
do i=1 to 2320;
do t=1 to 5;
f=N[i,1];
g=T[t,1];
B_0[i,t]=0;
do c=1 to 248064;
if NT[c,1]=f & NT[c,2]=g then do;
read all where(SecCode=f & TDate=g) into S;
XX=0;
XY=0;
do ini=1 to 15;
XX=XX+S[ini,4]*S[ini,4];
XY=XY+S[ini+1,4]*S[ini,4];
end;
b_0=XY/XX;
B_0[i,t]=b_0;
end;
end;
end;
end;
store B_0;
quit;
```

Accepted Solutions

Solution

08-18-2017
11:37 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-18-2017 04:20 PM

All Replies

Solution

08-18-2017
11:37 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-18-2017 04:20 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

08-18-2017 11:20 PM

Thank you for your help. Problem of the subscript solved. But new problem appears: ERROR: Not enough memory to store all matrices.

It seems that I do not have too many matrices in my code. The matrix S is always replaced.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-19-2017 02:26 PM

I can't tell from what you've shown.

I would advise that you remove the hard-coded values for the loops and replace them with terms such as nrow(N), nrow(T), and nrow(NT), which will make your code easier to read and more portable.

The statements

```
b_0=XY/XX;
B_0[i,t]=b_0;
```

are also wrong b/c you are overwriting b_0, which is the same as B_0.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

08-20-2017 09:06 AM

Are the matrices B_0 and S actually a lot larger than you think they are? Check by adding the command

`show names;`

at strategic points to obtain a list of matrix names and sizes.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

08-23-2017 03:26 AM

I don't quite understand what hard-coded values are, would you mind to explain more details on it and how to deal with it by steps? Thanks a lot, I do need your help!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-23-2017 05:43 AM

Rick is suggesting that if you write the loop like this:

`do c = 1 to nrow(NT);`

then there is no need to check the size of the matrix NT each time the program is run. Presumably, when you wrote the code above, you had to check that NT had 248064 rows and then hardcode this value in the DO statement.

I am worried about the potential number of iterations for the inner most loop. Depending on the data and how often the IF statement is true, then you may be trying to execute the read statement millions, may be billions of times. This sounds inefficient and it may be better to read the contents of ba.t3 into a matrix at the beginning, and then refer to it rather than the SAS data set.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to IanWakeling

08-23-2017 10:09 AM

Yes! The number of iteration is up to 2320(stocks)×120(days), totally, 248064(some are missing). And in each iteration, I need to read data of each stock in each day from ba.t3 to a matrix S. It is terribly inefficient, that one iteration cost me around 30 seconds and all the iterations will cost me 100 days, but I do not know how to improve it.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-23-2017 08:09 AM

Since we are offering advice now, I want to point out that the loop to compute XX and XY can be replaced by more efficient statements that use vector multiplication:

`/* to test, create any matrix S that has at least 16 rows and at least 4 coluumns */`

/* replace current code that uses a loop... */
XX=0;
XY=0;
do ini=1 to 15;
XX=XX+S[ini,4]*S[ini,4];
XY=XY+S[ini+1,4]*S[ini,4];
end;
print XX XY;
/* ... with more efficient code that uses a vector inner product */
X = S[1:15,4];
Y = S[2:16,4];
XX = X` * X;
XY = X` * Y;
print XX XY;

Other loops can probably be similarly improved.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-23-2017 10:15 AM

The way to improve it is to post sample data (maybe 3 stocks x 4 days) including the N, T, and NT matrices.

Also, describe in words what you want the program to do. Then other programmers can make recommendations for improvements.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

08-24-2017 12:20 PM

Thank you for your suggestions.

table below is a sample of my data for one stock in one day(3×3 or more are too big to list), matrix N is the list of distinct SecCodes, T is the list of distinct TDates, and NT is the list of all the combinations of stock code and date.

The goal of my programming is to calculate an evolutionary AR(1) for every stock in every day. The data I got has 59782508 observations, which cover half year(120 days) and 2320 stocks. So I need to read the matrices from the data for 2320*120 times, which ask for a long time to run. The efficiency of the code is a big problem for me.

```
proc iml;
load N;
load D;
load NT;
load B;
use ba.t3;
do i=2320 to 2320;
do t=120 to 120;
f=N[i,1];
g=D[t,1];
B[i,t]=0;
do c=1 to 248064;
if NT[c,1]=f & NT[c,2]=g then do;
read all where(SecCode=f & TDate=g) into S;
XX=0;
XY=0;
do ini=1 to 15;
XX=XX+S[ini,4]*S[ini,4];
XY=XY+S[ini+1,4]*S[ini,4];
end;
b_0=XY/XX;
B[i,t]=b_0;
end;
end;
end;
end;
store B;
quit;
```

Obs SecCode TDate MinTime StartPrc HighPrc LowPrc EndPrc MinTq MinTm

1 | 20151008 | 930 | 10.85 | 10.85 | 10.85 | 10.85 | 2160486 | 23441273.1 |

1 | 20151008 | 931 | 10.86 | 10.87 | 10.86 | 10.86 | 3495473 | 37987109.2 |

1 | 20151008 | 932 | 10.86 | 10.86 | 10.8 | 10.8 | 987641 | 10693639.26 |

1 | 20151008 | 933 | 10.8 | 10.8 | 10.77 | 10.78 | 889400 | 9598748 |

1 | 20151008 | 934 | 10.78 | 10.8 | 10.78 | 10.8 | 1218735 | 13125513.25 |

1 | 20151008 | 935 | 10.8 | 10.82 | 10.79 | 10.79 | 644800 | 6964091.82 |

1 | 20151008 | 936 | 10.79 | 10.82 | 10.78 | 10.8 | 418278 | 4519166.4 |

1 | 20151008 | 937 | 10.8 | 10.81 | 10.8 | 10.8 | 550422 | 5949659.6 |

1 | 20151008 | 938 | 10.8 | 10.8 | 10.78 | 10.79 | 403600 | 4354584.11 |

1 | 20151008 | 939 | 10.78 | 10.79 | 10.77 | 10.77 | 361400 | 3894113 |

1 | 20151008 | 940 | 10.76 | 10.77 | 10.75 | 10.75 | 578300 | 6221391.16 |

1 | 20151008 | 941 | 10.75 | 10.79 | 10.75 | 10.79 | 755382 | 8129020.25 |

1 | 20151008 | 942 | 10.8 | 10.81 | 10.76 | 10.77 | 862683 | 9301424.35 |

1 | 20151008 | 943 | 10.81 | 10.82 | 10.78 | 10.82 | 465500 | 5031375.8 |

1 | 20151008 | 944 | 10.78 | 10.81 | 10.78 | 10.79 | 119600 | 1291438 |

1 | 20151008 | 945 | 10.78 | 10.79 | 10.77 | 10.77 | 142800 | 1539261 |

1 | 20151008 | 946 | 10.77 | 10.8 | 10.77 | 10.79 | 476800 | 5140467 |

1 | 20151008 | 947 | 10.79 | 10.79 | 10.78 | 10.79 | 185300 | 1998904 |

1 | 20151008 | 948 | 10.79 | 10.79 | 10.78 | 10.79 | 366259 | 3949433.02 |

1 | 20151008 | 949 | 10.79 | 10.8 | 10.78 | 10.79 | 334812 | 3612752.6 |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to supmilk

08-24-2017 02:18 PM

Well, I can't make sense of what you are telling me and there is no way to run your program, so I give up.

Did you try using an ETS procedure? I am guessing that the best way to do this is to sort by SECCODE and TDATE and then use something like PROC AUTOREG and BY SECCODE TDATE to compute the regression coefficients. You can use the OUTEST= option to output the regression coefficients. Be sure to use ODS EXCLUDE ALL to suppress all the tables/graphs that you don't want. Your code might look somthing like this (untested)

ods exclude all;

proc autoreg data=mydata plots=none outest=ParamEst;

by SecCode TDate;

model StartPrc = / nlags=1; /* ??? */

run;quit;

ods exclude none;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

08-24-2017 05:57 PM

I can try to make it clear: I want split my data by SecCode and TDate, then calculate in the splitted data.

The data has been sorted, but still low efficient while i and t are large.

I appreciate your help very much.