I need to find the fastest, most efficient way to compute:
{sum over t} ({product over j} (X[j-1])*Y[t])
For large j and t, and with a few more variables in the equation, this requires creating a lot of variables (columns), which seems inefficient. I was thinking of defining arrays with macro variables for X(j), Y(t), and similar variables, but I am not sure whether that is the fastest approach or whether there is a better way with PROC IML or some other trick. Any suggestions?
Thanks a lot in advance.
P.S. X and Y are numeric variables with values in (0,1). Also, 't' ranges from 1 to n and 'j' ranges from 1 to 't'.
What do X and Y look like here?
It would be best to provide example data and the result you expect.
When t=1, do you skip the product? If so, then t ranges from 2 to N.
I'm assuming that the outer summation will go from t=2..N. Otherwise, you can modify the code accordingly. The key to an efficient implementation is to recognize that the formula is separable and factors into a cumulative product and the powers of the elements of Y:
proc iml;
x = (1:4)`/5;
y = (2:5)`/5;
N = nrow(y);
/* naive loop: sum of product.
For testing purposes only! */
s = 0;
do t = 2 to N;
p = 1;
do j = 2 to t;
p = p* (x[j-1]#y[t]);
end;
s = s + p;
end;
print s;
/* efficient computation: each term factors into a cumulative
product of X times a power of Y */
xT = cuprod(x); /* cumulative product */
yT = y##(0:N-1)`; /* powers of Y */
v = xT[1:N-1] # yT[2:N]; /* elementwise product */
s2 = sum(v);
print s2;
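To spell out why the two computations agree, here is the factorization written out. It assumes the same convention as the code above (outer sum over t = 2..N, inner product over j = 2..t); the index k is introduced only for this derivation.

```latex
S \;=\; \sum_{t=2}^{N} \prod_{j=2}^{t} \bigl( x_{j-1}\, y_t \bigr)
  \;=\; \sum_{t=2}^{N} \Bigl( \prod_{k=1}^{t-1} x_k \Bigr)\, y_t^{\,t-1}
```

The inner product has t-1 factors, so y_t appears t-1 times and can be pulled out as a power. In the code, xT[t-1] holds the cumulative product of x through element t-1, and yT[t] holds y[t] raised to the power t-1, so v contains exactly the terms of the sum.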
If you provide data and the expected results, other experts and I will think about it.
Suppose a person can belong to any of the RACEs with probability PROB_RACE, and the marginal probability of dying in each of years 1 to 3 is given (the marginal survival probability is 1-PROB_DYING):
RACE | PROB_RACE | PROB_DYING_YR1 | PROB_DYING_YR2 | PROB_DYING_YR3 |
A | 0.26 | 0.1 | 0.2 | 0.3 |
B | 0.35 | 0.16 | 0.25 | 0.45 |
C | 0.23 | 0.23 | 0.34 | 0.17 |
D | 0.16 | 0.18 | 0.17 | 0.14 |
So the person belonging to RACE A will have a probability of survival after 3 years, call it SURV3_A = (1-0.1)*(1-0.2)*(1-0.3).
Similarly, SURV3_B=(1-0.16)*(1-0.25)*(1-0.45) and so on...
A random person then has a probability of survival after 3 years of: PROB_A*SURV3_A+PROB_B*SURV3_B+PROB_C*SURV3_C+PROB_D*SURV3_D.
I would like to generalize this to calculate the probability of survival of a random person after 'T' years when there are 'N' possible RACEs.
I can think of taking one row at a time, using CUPROD, and then summing across the rows, but that seems inefficient. Is there an easier way?
Aha! Now we see what you are trying to do! Much clearer and much simpler. I think the "trick" you are looking for is to use a subscript reduction operator for each row and across the columns:
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1 0.2 0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
p_surv = p_Race # (1 - p_Die)[, #];
print p_surv;
Hi @Rick_SAS, sorry for getting overexcited yesterday and not verifying everything. The trick you suggested gives only the final column:
0.504 |
0.3465 |
0.421806 |
0.585316 |
Is there a way to get all the columns in the range, like:
0.9 | 0.72 | 0.504 |
0.84 | 0.63 | 0.3465 |
0.77 | 0.5082 | 0.421806 |
0.82 | 0.6806 | 0.585316 |
Thanks a lot in advance.
This same question is cross-posted.
Where are you getting those numbers? Those are not the results of the program I sent.
Anyway, to answer your question, it sounds like you want the matrix whose columns contain the cumulative products of 1 - p_Die.
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1 0.2 0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
/* form matrix of cumulative products of columns */
p_cumul = 1 - p_Die;
do j = 2 to ncol(p_cumul);
p_cumul[,j] = p_cumul[,j] # p_cumul[,j-1];
end;
M = p_Race # p_cumul;
print M;
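If the end goal is the survival probability of a random person after each year (the weighted sum described earlier in the thread), one possible follow-up is to multiply the transposed race probabilities by the cumulative matrix. This is only a sketch under that assumption; the name surv_by_year is made up for illustration and does not appear in the original program.

```sas
proc iml;
p_Race = {0.26, 0.35, 0.23, 0.16};
p_Die = {
0.1 0.2 0.3,
0.16 0.25 0.45,
0.23 0.34 0.17,
0.18 0.17 0.14 };
/* cumulative products of the survival probabilities across columns */
p_cumul = 1 - p_Die;
do j = 2 to ncol(p_cumul);
p_cumul[,j] = p_cumul[,j] # p_cumul[,j-1];
end;
/* (1 x N) row of race probabilities times (N x T) cumulative matrix:
   element k is the probability that a random person survives k years.
   surv_by_year is a hypothetical name used only in this sketch. */
surv_by_year = p_Race` * p_cumul;
print surv_by_year;
quit;
```

The matrix product performs the weighting by p_Race and the sum over races in one step, so it should work unchanged for any number of races (rows) and years (columns). The last element corresponds to PROB_A*SURV3_A + ... + PROB_D*SURV3_D.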
Sorry, I only gave the calculations for the (1-X) part. The other parts were working properly, so I figured that once I had a solution for this one, the rest would be solved. My apologies.