Solved: Problem of Invalid subscript

supmilk · Posted 08-18-2017 02:21 PM

the error shows as below:

ERROR: (execution) Invalid subscript or subscript out of range.

operation : [ at line 186 column 6
operands : T, t, *LIT1006

but actually, T is a 120×1 matrix, even I adjust "do t=1 to 12o" to "do t=1 to 2", I do not understand why.

and the code:

proc iml;
load N;
load T;
load NT;
B_0=N*(T`);
use ba.t3;
do i=1 to 2320;
  do t=1 to 5;
  f=N[i,1];
  g=T[t,1];
  B_0[i,t]=0;
    do c=1 to 248064;
    if NT[c,1]=f & NT[c,2]=g then do;
      read all where(SecCode=f & TDate=g) into S;
      XX=0;
      XY=0;
        do ini=1 to 15;
          XX=XX+S[ini,4]*S[ini,4];
	      XY=XY+S[ini+1,4]*S[ini,4];
        end;
      b_0=XY/XX;
      B_0[i,t]=b_0;
	end;
  end;
  end;
end;
store B_0;
quit;

Rick_SAS · Posted 08-18-2017 04:20 PM

SAS is not case sensitive. 'T' and 't' represent the same variable name. Thus you load a matrix named T, but you reassign that matrix when you use "do t = 1 to 5". Change the iterator from 't' to 'k' andupdate all the subscripts that are currently using 't'.

View solution in original post

Rick_SAS · Posted 08-18-2017 04:20 PM

SAS is not case sensitive. 'T' and 't' represent the same variable name. Thus you load a matrix named T, but you reassign that matrix when you use "do t = 1 to 5". Change the iterator from 't' to 'k' andupdate all the subscripts that are currently using 't'.

supmilk · Posted 08-18-2017 11:20 PM

Thank you for your help. Problem of the subscript solved. But new problem appears: ERROR: Not enough memory to store all matrices.

It seems that I do not have too many matrices in my code. The matrix S is always replaced.

Rick_SAS · Posted 08-19-2017 02:26 PM

I can't tell from what you've shown.

I would advise that you remove the hard-coded values for the loops and replace them with terms such as nrow(N), nrow(T), and nrow(NT), which will make your code easier to read and more portable.

The statements

     b_0=XY/XX;
      B_0[i,t]=b_0;

are also wrong b/c you are overwriting b_0, which is the same as B_0.

IanWakeling · Posted 08-20-2017 09:06 AM

Are the matrices B_0 and S actually a lot larger than you think they are? Check by adding the command

show names;

at strategic points to obtain a list of matrix names and sizes.

supmilk · Posted 08-23-2017 03:26 AM

I don't quite understand what hard-coded values are, would you mind to explain more details on it and how to deal with it by steps? Thanks a lot, I do need your help!

IanWakeling · Posted 08-23-2017 05:43 AM

Rick is suggesting that if you write the loop like this:

do c = 1 to nrow(NT);

then there is no need to check the size of the matrix NT each time the program is run. Presumably, when you wrote the code above, you had to check that NT had 248064 rows and then hardcode this value in the DO statement.

I am worried about the potential number of iterations for the inner most loop. Depending on the data and how often the IF statement is true, then you may be trying to execute the read statement millions, may be billions of times. This sounds inefficient and it may be better to read the contents of ba.t3 into a matrix at the beginning, and then refer to it rather than the SAS data set.

supmilk · Posted 08-23-2017 10:09 AM

Yes! The number of iteration is up to 2320(stocks)×120(days), totally, 248064(some are missing). And in each iteration, I need to read data of each stock in each day from ba.t3 to a matrix S. It is terribly inefficient, that one iteration cost me around 30 seconds and all the iterations will cost me 100 days, but I do not know how to improve it.

Rick_SAS · Posted 08-23-2017 08:09 AM

Since we are offering advice now, I want to point out that the loop to compute XX and XY can be replaced by more efficient statements that use vector multiplication:

/* to test, create any matrix S that has at least 16 rows and at least 4 coluumns */

/* replace current code that uses a loop... */
XX=0;
XY=0;
do ini=1 to 15;
	XX=XX+S[ini,4]*S[ini,4];
	XY=XY+S[ini+1,4]*S[ini,4];
end;
print XX XY;

/* ... with more efficient code that uses a vector inner product */
X = S[1:15,4];
Y = S[2:16,4];
XX = X` * X;
XY = X` * Y;
print XX XY;

Other loops can probably be similarly improved.

Rick_SAS · Posted 08-23-2017 10:15 AM

Yes! The number of iteration is up to 2320(stocks)×120(days), totally, 248064(some are missing). And in each iteration, I need to read data of each stock in each day from ba.t3 to a matrix S. It is terribly inefficient, that one iteration cost me around 30 seconds and all the iterations will cost me 100 days, but I do not know how to improve it.

The way to improve it is to post sample data (maybe 3 stocks x 4 days) including the N, T, and NT matrices.

Also, describe in words what you want the program to do. Then other programmers can make recommendations for improvements.

supmilk · Posted 08-24-2017 12:20 PM

Thank you for your suggestions.

table below is a sample of my data for one stock in one day(3×3 or more are too big to list), matrix N is the list of distinct SecCodes, T is the list of distinct TDates, and NT is the list of all the combinations of stock code and date.

The goal of my programming is to calculate an evolutionary AR(1) for every stock in every day. The data I got has 59782508 observations, which cover half year(120 days) and 2320 stocks. So I need to read the matrices from the data for 2320*120 times, which ask for a long time to run. The efficiency of the code is a big problem for me.

proc iml;
load N;
load D;
load NT;
load B;
use ba.t3;
do i=2320 to 2320;
  do t=120 to 120;
  f=N[i,1];
  g=D[t,1];
  B[i,t]=0;
    do c=1 to 248064;
    if NT[c,1]=f & NT[c,2]=g then do;
      read all where(SecCode=f & TDate=g) into S;
      XX=0;
      XY=0;
        do ini=1 to 15;
          XX=XX+S[ini,4]*S[ini,4];
	      XY=XY+S[ini+1,4]*S[ini,4];
        end;
      b_0=XY/XX;
      B[i,t]=b_0;
	end;
  end;
  end;
end;
store B;
quit;

Obs SecCode TDate MinTime StartPrc HighPrc LowPrc EndPrc MinTq MinTm

1	20151008	930	10.85	10.85	10.85	10.85	2160486	23441273.1
1	20151008	931	10.86	10.87	10.86	10.86	3495473	37987109.2
1	20151008	932	10.86	10.86	10.8	10.8	987641	10693639.26
1	20151008	933	10.8	10.8	10.77	10.78	889400	9598748
1	20151008	934	10.78	10.8	10.78	10.8	1218735	13125513.25
1	20151008	935	10.8	10.82	10.79	10.79	644800	6964091.82
1	20151008	936	10.79	10.82	10.78	10.8	418278	4519166.4
1	20151008	937	10.8	10.81	10.8	10.8	550422	5949659.6
1	20151008	938	10.8	10.8	10.78	10.79	403600	4354584.11
1	20151008	939	10.78	10.79	10.77	10.77	361400	3894113
1	20151008	940	10.76	10.77	10.75	10.75	578300	6221391.16
1	20151008	941	10.75	10.79	10.75	10.79	755382	8129020.25
1	20151008	942	10.8	10.81	10.76	10.77	862683	9301424.35
1	20151008	943	10.81	10.82	10.78	10.82	465500	5031375.8
1	20151008	944	10.78	10.81	10.78	10.79	119600	1291438
1	20151008	945	10.78	10.79	10.77	10.77	142800	1539261
1	20151008	946	10.77	10.8	10.77	10.79	476800	5140467
1	20151008	947	10.79	10.79	10.78	10.79	185300	1998904
1	20151008	948	10.79	10.79	10.78	10.79	366259	3949433.02
1	20151008	949	10.79	10.8	10.78	10.79	334812	3612752.6

Rick_SAS · Posted 08-24-2017 02:18 PM

Well, I can't make sense of what you are telling me and there is no way to run your program, so I give up.

Did you try using an ETS procedure? I am guessing that the best way to do this is to sort by SECCODE and TDATE and then use something like PROC AUTOREG and BY SECCODE TDATE to compute the regression coefficients. You can use the OUTEST= option to output the regression coefficients. Be sure to use ODS EXCLUDE ALL to suppress all the tables/graphs that you don't want. Your code might look somthing like this (untested)

ods exclude all;

proc autoreg data=mydata plots=none outest=ParamEst;

by SecCode TDate;

model StartPrc = / nlags=1; /* ??? */

run;quit;

ods exclude none;

supmilk · Posted 08-24-2017 05:57 PM

I can try to make it clear: I want split my data by SecCode and TDate, then calculate in the splitted data.

The data has been sorted, but still low efficient while i and t are large.

I appreciate your help very much.

Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Re: Problem of Invalid subscript

Catch up on SAS Innovate 2026