I need to find the most efficient way to compute:
{sum over t} ({product over j} (X[j-1])*Y[t])
For large j and t, and with a few more variables in the equation, this requires creating a lot of variables (columns), which seems inefficient. I was thinking of defining arrays with macro variables for X(j), Y(t), and so on, but I am not sure whether that is the fastest approach or whether there is a better way with PROC IML or some other trick. Any suggestions?
Thanks a lot in advance.
P.S. X and Y are numeric variables in (0,1). Also, 't' varies from 1 to n and 'j' varies from 1 to 't'.
What do X and Y look like here?
It would be best to provide example data and the result you expect.
When t=1, do you skip the product? If so, then t ranges from 2 to N.
I'm assuming that the outer summation will go from t=2..N. Otherwise, you can modify the code accordingly. The key to an efficient implementation is to recognize that the formula is separable and factors into a cumulative product and the powers of the elements of Y:
proc iml;
x = (1:4)`/5;
y = (2:5)`/5;
N = nrow(y);
/* naive loop: sum of products.
   For testing purposes only! */
s = 0;
do t = 2 to N;
   p = 1;
   do j = 2 to t;
      p = p * (x[j-1] # y[t]);
   end;
   s = s + p;
end;
print s;
/* efficient computation: each term factors into a cumulative product of X
   times a power of Y */
xT = cuprod(x);             /* xT[t] = x[1]*...*x[t] */
yT = y##(0:N-1)`;           /* yT[t] = y[t]^(t-1) */
v = xT[1:N-1] # yT[2:N];    /* t-th term: xT[t-1] * yT[t] */
s2 = sum(v);
print s2;
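For readers who want the factorization spelled out, here is how I read it, assuming (as the naive loop does) that Y[t] sits inside the product and that t runs from 2 to N. The inner product has t-1 factors, so y_t appears t-1 times:

```latex
\[
\sum_{t=2}^{N} \prod_{j=2}^{t} \bigl( x_{j-1}\, y_t \bigr)
  \;=\; \sum_{t=2}^{N} y_t^{\,t-1} \prod_{j=1}^{t-1} x_j
\]
```

The product over x on the right is element t-1 of the cumulative product of x, and the power of y_t is element t of the vector of powers, which is why the two vectors are offset by one before they are multiplied.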
If you provide data and the expected results, I and other experts can think about it.
Suppose a person can belong to any of the RACEs with probability PROB_RACE, and the marginal probability of dying in each of years 1 to 3 is given (the marginal survival probability is 1-PROB_DYING):
RACE | PROB_RACE | PROB_DYING_YR1 | PROB_DYING_YR2 | PROB_DYING_YR3 |
A | 0.26 | 0.1 | 0.2 | 0.3 |
B | 0.35 | 0.16 | 0.25 | 0.45 |
C | 0.23 | 0.23 | 0.34 | 0.17 |
D | 0.16 | 0.18 | 0.17 | 0.14 |
So the person belonging to RACE A will have a probability of survival after 3 years, call it SURV3_A = (1-0.1)*(1-0.2)*(1-0.3).
Similarly, SURV3_B=(1-0.16)*(1-0.25)*(1-0.45) and so on...
And a random person will have a probability of survival after 3 years of: PROB_A*SURV3_A+PROB_B*SURV3_B+PROB_C*SURV3_C+PROB_D*SURV3_D.
I was thinking of a generalization to calculate probability of survival of any random person after 'T' years when there can be 'N' possible RACEs.
I can think of taking one row at a time, using CUPROD, and then summing across columns, but that seems inefficient. Is there an easier way?
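Written as a formula, the generalization described above would look roughly like this. The symbols are only notation introduced here for illustration: p_r stands for PROB_RACE of race r, d_{r,k} for PROB_DYING_YRk of race r, and S_T for the survival probability of a random person after T years.

```latex
\[
S_T \;=\; \sum_{r=1}^{N} p_r \prod_{k=1}^{T} \bigl( 1 - d_{r,k} \bigr)
\]
```

With N = 4 and T = 3 this reduces to the PROB_A*SURV3_A + ... + PROB_D*SURV3_D expression given for the example table.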
Aha! Now we see what you are trying to do! Much clearer and much simpler. I think the "trick" you are looking for is to use a subscript reduction operator for each row and across the columns:
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1 0.2 0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
p_surv = p_Race # (1 - p_Die)[, #];
print p_surv;
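If the single overall number for a random person is what is needed, one possible follow-up is to sum the weighted row products. This is only a sketch using the same example data; the name p_total is made up for illustration.

```sas
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1  0.2  0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
/* row-wise product of survival probabilities, weighted by race */
p_surv = p_Race # (1 - p_Die)[, #];
/* p_total is a hypothetical name: overall 3-year survival, approximately 0.443 */
p_total = sum(p_surv);
print p_total;
quit;
```

The [, #] subscript reduction multiplies across the columns of each row, so no explicit loop over races is needed.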
Hi @Rick_SAS, sorry for being super excited yesterday and not verifying everything. The trick you suggested gives only the final column:
0.504 |
0.3465 |
0.421806 |
0.585316 |
Is there a way to get all the columns in the range, like:
0.9 | 0.72 | 0.504 |
0.84 | 0.63 | 0.3465 |
0.77 | 0.5082 | 0.421806 |
0.82 | 0.6806 | 0.585316 |
Thanks a lot in advance.
This same question is cross-posted.
Where are you getting those numbers? Those are not the results of the program I sent.
Anyway, to answer your question, it sounds like you want to use the matrix whose columns contain the cumulative products of 1 - p_Die.
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1 0.2 0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
/* form matrix of cumulative products of columns */
p_cumul = 1 - p_Die;
do j = 2 to ncol(p_cumul);
p_cumul[,j] = p_cumul[,j] # p_cumul[,j-1];
end;
M = p_Race # p_cumul;
print M;
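To get from the cumulative matrix to the survival probability of a random person after each number of years, one option is a single matrix product with the race probabilities. This is a sketch on the same example data, not something verified against the original poster's real data; p_total is a name made up for illustration.

```sas
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1  0.2  0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
/* cumulative products of survival probabilities across the columns */
p_cumul = 1 - p_Die;
do j = 2 to ncol(p_cumul);
   p_cumul[,j] = p_cumul[,j] # p_cumul[,j-1];
end;
/* p_total (hypothetical name) is a 1 x T row vector:
   element k is the overall survival probability after k years */
p_total = p_Race` * p_cumul;
print p_total;
quit;
```

The loop runs over years rather than over races, so the cost grows with T while each step stays vectorized over all N races.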
Sorry, I only gave the calculations for the (1-X) part. The other parts were working properly, so I figured that once I had the solution to this one, the rest would be solved. My apologies.