Suppose a person can belong to any of the RACEs with the probability of PROB_RACE. And marginal probability of dying in years 1 to 3 is given(marginal survival will be 1-PROB_DYING):
RACE | PROB_RACE | PROB_DYING_YR1 | PROB_DYING_YR2 | PROB_DYING_YR3 |
A | 0.26 | 0.1 | 0.2 | 0.3 |
B | 0.35 | 0.16 | 0.25 | 0.45 |
C | 0.23 | 0.23 | 0.34 | 0.17 |
D | 0.16 | 0.18 | 0.17 | 0.14 |
So the person belonging to RACE A will have a probability of survival after 3 years, call it SURV3_A = (1-0.1)*(1-0.2)*(1-0.3).
Similarly, SURV3_B=(1-0.16)*(1-0.25)*(1-0.45) and so on...
And a random person will have a probability of survival after 3 years as : PROB_A*SURV3_A+PROB_B*SURV3_B+PROB_C*SURV3_C+PROB_D*SURV3_D.
I was thinking of a generalization to calculate probability of survival of any random person after 'T' years when there can be 'N' possible RACEs.
I can think of taking one row at a time,using CUPROD, then summing across columns which seem inefficient, so wondering if any easy way?
Just want to add that it is considered bad etiquette to post the same question twice and solicit advice in two separate threads.
I answered in the other thread. Good luck with your problem.
Implementing your base formula is trivial in a data step. Can you show what you expect as the final, final output, assuming this is the input? I want to ensure there's nothing I'm missing before taking a further look or providing any code.
@thepushkarsingh wrote:
Suppose a person can belong to any of the RACEs with the probability of PROB_RACE. And marginal probability of dying in years 1 to 3 is given(marginal survival will be 1-PROB_DYING):
RACE PROB_RACE PROB_DYING_YR1 PROB_DYING_YR2 PROB_DYING_YR3 A 0.26 0.1 0.2 0.3 B 0.35 0.16 0.25 0.45 C 0.23 0.23 0.34 0.17 D 0.16 0.18 0.17 0.14
So the person belonging to RACE A will have a probability of survival after 3 years, call it SURV3_A = (1-0.1)*(1-0.2)*(1-0.3).
Similarly, SURV3_B=(1-0.16)*(1-0.25)*(1-0.45) and so on...
And a random person will have a probability of survival after 3 years as : PROB_A*SURV3_A+PROB_B*SURV3_B+PROB_C*SURV3_C+PROB_D*SURV3_D.
I was thinking of a generalization to calculate probability of survival of any random person after 'T' years when there can be 'N' possible RACEs.
I can think of taking one row at a time,using CUPROD, then summing across columns which seem inefficient, so wondering if any easy way?
Hi @Reeza, apologies for earlier responses. I have following matrix, X:
0.1 | 0.2 | 0.3 |
0.16 | 0.25 | 0.45 |
0.23 | 0.34 | 0.17 |
0.18 | 0.17 | 0.14 |
(1-X) will be:
0.9 | 0.8 | 0.7 |
0.84 | 0.75 | 0.55 |
0.77 | 0.66 | 0.83 |
0.82 | 0.83 | 0.86 |
And I need cumulative product at each row like:
0.9 | 0.72 | 0.504 |
0.84 | 0.63 | 0.3465 |
0.77 | 0.5082 | 0.421806 |
0.82 | 0.6806 | 0.585316 |
Then I need column wise sum like:
3.33 | 2.5388 | 1.857622 |
And in the end, a row sum :
7.726422 |
I was superexcited with @Rick_SAS's solution which suggested use of (1-X)[,#], but it only gives me final column. Any suggestion?
Thanks a lot in advance.
Just want to add that it is considered bad etiquette to post the same question twice and solicit advice in two separate threads.
I answered in the other thread. Good luck with your problem.
I am very sorry, while I was posting further information in your solution, I saw the section where it said 'If you need further help post a new question', so I marked the original post as solved and posted it again separately. I thought it would be disrespectful if I don't tell you about my inquiries, so I posted there too. I didn't mean to disrespect, which seemingly I did eventually.
I accept your apology, but let me clarify that the issue isn't that cross-posting is disrespectful. When there are two threads, it is hard for experts to work from the same information and to build a program that solves your problem. It will also be difficult for someone who has a similar question in the future to solve their problem by consulting these pages. There are ways to merge the two threads into one, but they have diverged so much that I'm not sure it would be helpful. I have a hard time understanding these threads even though I was involved in writing one of them.
I would like to encourage you to revisit the "accepted solution" that you marked in the other thread. After you marked my initial response as correct, you provided extra information and I posted additional programs. I don't know whether the later solutions are better, but please mark the answer that best resolves your problem.
Thanks for understanding. I'll not repeat same mistake.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.