I have multiple observations for each individual, and I want to calculate a complex function of probabilities that depends on the previous two values. I manually calculate c2 and c3 in Part 1 below, then use them in Part 2 to calculate the subsequent values c4, c5, etc., up to the last visit for each ID. Unfortunately, I had to repeat the code several times (as seen in Part 2), since lag does not retain the value. Is there a more efficient way to do this? I would greatly appreciate any assistance.
Thanks
data prob;set prob;by id;if first.id then visit=1;else visit+1;run;
/****Part 1: generate c2-c3**********/
data probc;set prob;by id;
if visit eq 2 then c=(lagprobgt1000*probgt1000) + (lagproblt1000*probgt1000) + (lagprobgt1000*problt1000);
if visit eq 3 then c= (lag2probgt1000*lagprobgt1000*probgt1000) + (lag2problt1000*lagprobgt1000*probgt1000) + (lag2probgt1000*lagproblt1000*probgt1000) + (lag2probgt1000*lagprobgt1000*problt1000) + (lag2problt1000*lagprobgt1000*problt1000);
run;
/***** Part 2 Recursively calculate using the initial probabilities******
This is not the most efficient way *****/
data probc2;
set probc;
by id;
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
run;
Accepted Solutions
Actually, since (P1*P2*Q3 + Q1*P2*Q3) can be reduced to (P1+Q1)*P2*Q3, and P1+Q1 is just 1, you can define C1 = 1 and then you don't need a separate definition for C3. This eliminates the need for the second lag of the P values.
You also don't really need the second probability variable (Q), since it is just the complement of the first.
data want ;
set have ;
by id day;
length visit c lagp lagc lag2c 8;
if first.id then call missing(of visit lag:);
visit+1;
select (visit);
when (1) c=1;
when (2) c=p + lagp - p*lagp ;
otherwise c=lag2c*lagp*(1-p) + lagc*p ;
end;
output;
lagp=p;
lag2c=lagc;
lagc=c;
retain lag: ;
run;
You can simplify even more if you just initialize the lagged values to 1.
data want ;
set have ;
by id day;
if first.id then do;
visit=0; lagp=1; lagc=1; lag2c=1;
end;
visit+1;
c=lag2c*lagp*(1-p) + lagc*p ;
output;
lagp=p;
lag2c=lagc;
lagc=c;
retain visit lagp lagc lag2c ;
run;
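The algebra behind both simplifications can be written out as a short check. This is a sketch in the thread's notation, assuming Q_n = 1 - P_n and taking C2 and C3 as they are coded in the thread (C2 = lagp*p + lagq*p + lagp*q, and the five-term sum for C3). The symbols P_0, C_0 and C_{-1} are introduced here only to stand for the lagged values that the second data step initializes to 1.

```latex
\begin{align*}
C_3 &= P_1P_2P_3 + Q_1P_2P_3 + P_1Q_2P_3 + P_1P_2Q_3 + Q_1P_2Q_3 \\
    &= (P_1P_2 + Q_1P_2 + P_1Q_2)\,P_3 + (P_1 + Q_1)\,P_2Q_3 \\
    &= C_2P_3 + C_1P_2Q_3 \quad \text{with } C_1 = 1, \\[4pt]
C_2 &= P_1P_2 + Q_1P_2 + P_1Q_2 = P_2 + P_1(1 - P_2) = P_1 + P_2 - P_1P_2, \\[4pt]
\text{and with } & P_0 = C_0 = C_{-1} = 1: \\
C_1 &= C_{-1}P_0Q_1 + C_0P_1 = Q_1 + P_1 = 1, \\
C_2 &= C_0P_1Q_2 + C_1P_2 = P_1(1 - P_2) + P_2 = P_1 + P_2 - P_1P_2 .
\end{align*}
```

So the single statement c = lag2c*lagp*(1-p) + lagc*p reproduces C1 = 1, the same C2, and every later term, as long as the three lagged values are reset to 1 at the start of each ID.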
There is definitely a smarter way to do this. Can you show some of your data in a data step?
Sure. Thanks for the speedy response. See below
data test;
input day id probgt1000 problt1000 lagproblt1000 lag2problt1000 lag3problt1000 lagprobgt1000 lag2probgt1000 lag3probgt1000;
datalines;
0 1 0.999999838 1.62E-07
2 1 0.999991138 8.86E-06 1.62E-07 0.999999838
7 1 0.987787081 0.012212919 8.86E-06 1.62E-07 0.999991138 0.999999838
16 1 0.237545783 0.762454217 0.012212919 8.86E-06 1.62E-07 0.987787081 0.999991138 0.999999838
29 1 0.079696412 0.920303588 0.762454217 0.012212919 8.86E-06 0.237545783 0.987787081 0.999991138
57 1 0.001868728 0.998131272 0.920303588 0.762454217 0.012212919 0.079696412 0.237545783 0.987787081
91 1 8.13E-06 0.999991865 0.998131272 0.920303588 0.762454217 0.001868728 0.079696412 0.237545783
120 1 1.15E-07 0.999999885 0.999991865 0.998131272 0.920303588 8.13E-06 0.001868728 0.079696412
150 1 6.64E-10 0.999999999 0.999999885 0.999991865 0.998131272 1.15E-07 8.13E-06 0.001868728
180 1 1.76E-12 1 0.999999999 0.999999885 0.999991865 6.64E-10 1.15E-07 8.13E-06
210 1 2.11E-15 1 1 0.999999999 0.999999885 1.76E-12 6.64E-10 1.15E-07
240 1 0 1 1 1 0.999999999 2.11E-15 1.76E-12 6.64E-10
0 2 0.954524864 0.045475137
2 2 0.808897126 0.191102874 0.045475137 0.954524864
7 2 0.121488567 0.878511433 0.191102874 0.045475137 0.808897126 0.954524864
14 2 2.84197E-05 0.99997158 0.878511433 0.191102874 0.045475137 0.121488567 0.808897126 0.954524864
29 2 2.70E-06 0.999997298 0.99997158 0.878511433 0.191102874 2.84197E-05 0.121488567 0.808897126
56 2 2.01E-08 0.99999998 0.999997298 0.99997158 0.878511433 2.70E-06 2.84197E-05 0.121488567
85 2 6.75E-11 1 0.99999998 0.999997298 0.99997158 2.01E-08 2.70E-06 2.84197E-05
120 2 7.38E-14 1 1 0.99999998 0.999997298 6.75E-11 2.01E-08 2.70E-06
150 2 1.11E-16 1 1 1 0.99999998 7.38E-14 6.75E-11 2.01E-08
167 2 0 1 1 1 1 1.11E-16 7.38E-14 6.75E-11
180 2 0 1 1 1 1 0 1.11E-16 7.38E-14
0 3 0.999999999 6.66E-10
7 3 0.999326107 0.000673893 6.66E-10 0.999999999
9 3 0.99155374 0.00844626 0.000673893 6.66E-10 0.999326107 0.999999999
14 3 0.635937832 0.364062168 0.00844626 0.000673893 6.66E-10 0.99155374 0.999326107 0.999999999
28 3 0.523640435 0.476359565 0.364062168 0.00844626 0.000673893 0.635937832 0.99155374 0.999326107
56 3 0.302450412 0.697549588 0.476359565 0.364062168 0.00844626 0.523640435 0.635937832 0.99155374
84 3 0.355108729 0.644891271 0.697549588 0.476359565 0.364062168 0.302450412 0.523640435 0.635937832
90 3 0.475417732 0.524582268 0.644891271 0.697549588 0.476359565 0.355108729 0.302450412 0.523640435
120 3 0.931607984 0.068392016 0.524582268 0.644891271 0.697549588 0.475417732 0.355108729 0.302450412
150 3 0.998806882 0.001193118 0.068392016 0.524582268 0.644891271 0.931607984 0.475417732 0.355108729
166 3 0.999944187 0.000055813 0.001193118 0.068392016 0.524582268 0.998806882 0.931607984 0.475417732
180 3 0.999997751 2.25E-06 0.000055813 0.001193118 0.068392016 0.999944187 0.998806882 0.931607984
210 3 1 4.22E-10 2.25E-06 0.000055813 0.001193118 0.999997751 0.999944187 0.998806882
240 3 1 7.59E-15 4.22E-10 2.25E-06 0.000055813 1 0.999997751 0.999944187
270 3 1 1.29E-20 7.59E-15 4.22E-10 2.25E-06 1 1 0.999997751
300 3 1 2.02E-27 1.29E-20 7.59E-15 4.22E-10 1 1 1
330 3 1 2.95E-35 2.02E-27 1.29E-20 7.59E-15 1 1 1
360 3 1 3.95E-44 2.95E-35 2.02E-27 1.29E-20 1 1 1
390 3 1 4.86E-54 3.95E-44 2.95E-35 2.02E-27 1 1 1
420 3 1 5.47E-65 4.86E-54 3.95E-44 2.95E-35 1 1 1
450 3 1 5.62E-77 5.47E-65 4.86E-54 3.95E-44 1 1 1
0 4 0.999588107 0.000411893
2 4 0.994266075 0.005733926 0.000411893 0.999588107
7 4 0.686761914 0.313238086 0.005733926 0.000411893 0.994266075 0.999588107
10 4 0.230209335 0.769790666 0.313238086 0.005733926 0.000411893 0.686761914 0.994266075 0.999588107
14 4 0.008863027 0.991136973 0.769790666 0.313238086 0.005733926 0.230209335 0.686761914 0.994266075
28 4 0.007810872 0.992189128 0.991136973 0.769790666 0.313238086 0.008863027 0.230209335 0.686761914
56 4 0.006031055 0.993968945 0.992189128 0.991136973 0.769790666 0.007810872 0.008863027 0.230209335
86 4 0.026406683 0.973593317 0.993968945 0.992189128 0.991136973 0.006031055 0.007810872 0.008863027
90 4 0.042237651 0.957762349 0.973593317 0.993968945 0.992189128 0.026406683 0.006031055 0.007810872
120 4 0.443685146 0.556314854 0.957762349 0.973593317 0.993968945 0.042237651 0.026406683 0.006031055
150 4 0.925352277 0.074647723 0.556314854 0.957762349 0.973593317 0.443685146 0.042237651 0.026406683
180 4 0.998759633 0.001240367 0.074647723 0.556314854 0.957762349 0.925352277 0.443685146 0.042237651
210 4 0.99999798 2.02E-06 0.001240367 0.074647723 0.556314854 0.998759633 0.925352277 0.443685146
240 4 1 2.95E-10 2.02E-06 0.001240367 0.074647723 0.99999798 0.998759633 0.925352277
270 4 1 3.72E-15 2.95E-10 2.02E-06 0.001240367 1 0.99999798 0.998759633
300 4 1 3.98E-21 3.72E-15 2.95E-10 2.02E-06 1 1 0.99999798
330 4 1 3.55E-28 3.98E-21 3.72E-15 2.95E-10 1 1 1
360 4 1 2.64E-36 3.55E-28 3.98E-21 3.72E-15 1 1 1
390 4 1 1.62E-45 2.64E-36 3.55E-28 3.98E-21 1 1 1
420 4 1 8.19E-56 1.62E-45 2.64E-36 3.55E-28 1 1 1
450 4 1 3.41E-67 8.19E-56 1.62E-45 2.64E-36 1 1 1
0 5 0.999999999 9.08E-10
8 5 0.996994532 0.003005468 9.08E-10 0.999999999
9 5 0.990329715 0.009670285 0.003005468 9.08E-10 0.996994532 0.999999999
14 5 0.616951368 0.383048632 0.009670285 0.003005468 9.08E-10 0.990329715 0.996994532 0.999999999
28 5 0.066678872 0.933321128 0.383048632 0.009670285 0.003005468 0.616951368 0.990329715 0.996994532
56 5 1.72E-07 0.999999828 0.933321128 0.383048632 0.009670285 0.066678872 0.616951368 0.990329715
90 5 1.13E-13 1 0.999999828 0.933321128 0.383048632 1.72E-07 0.066678872 0.616951368
120 5 3.31E-13 1 1 0.999999828 0.933321128 1.13E-13 1.72E-07 0.066678872
150 5 9.52E-13 1 1 1 0.999999828 3.31E-13 1.13E-13 1.72E-07
168 5 1.78E-12 1 1 1 1 9.52E-13 3.31E-13 1.13E-13
180 5 2.68E-12 1 1 1 1 1.78E-12 9.52E-13 3.31E-13
210 5 7.39E-12 1 1 1 1 2.68E-12 1.78E-12 9.52E-13
240 5 1.99E-11 1 1 1 1 7.39E-12 2.68E-12 1.78E-12
270 5 5.27E-11 1 1 1 1 1.99E-11 7.39E-12 2.68E-12
300 5 1.37E-10 1 1 1 1 5.27E-11 1.99E-11 7.39E-12
330 5 3.47E-10 1 1 1 1 1.37E-10 5.27E-11 1.99E-11
360 5 8.61E-10 0.999999999 1 1 1 3.47E-10 1.37E-10 5.27E-11
390 5 2.10E-09 0.999999998 0.999999999 1 1 8.61E-10 3.47E-10 1.37E-10
420 5 5.00E-09 0.999999995 0.999999998 0.999999999 1 2.10E-09 8.61E-10 3.47E-10
450 5 1.17E-08 0.999999988 0.999999995 0.999999998 0.999999999 5.00E-09 2.10E-09 8.61E-10
0 6 0.999999992 7.86E-09
2 6 0.999999341 6.59E-07 7.86E-09 0.999999992
6 6 0.999321861 0.000678139 6.59E-07 7.86E-09 0.999999341 0.999999992
9 6 0.976089034 0.023910966 0.000678139 6.59E-07 7.86E-09 0.999321861 0.999999341 0.999999992
13 6 0.635258637 0.364741363 0.023910966 0.000678139 6.59E-07 0.976089034 0.999321861 0.999999341
27 6 0.386715096 0.613284904 0.364741363 0.023910966 0.000678139 0.635258637 0.976089034 0.999321861
51 6 0.240699785 0.759300215 0.613284904 0.364741363 0.023910966 0.386715096 0.635258637 0.976089034
90 6 0.112612642 0.887387358 0.759300215 0.613284904 0.364741363 0.240699785 0.386715096 0.635258637
120 6 0.077983686 0.922016314 0.887387358 0.759300215 0.613284904 0.112612642 0.240699785 0.386715096
150 6 0.052104636 0.947895364 0.922016314 0.887387358 0.759300215 0.077983686 0.112612642 0.240699785
180 6 0.033565496 0.966434504 0.947895364 0.922016314 0.887387358 0.052104636 0.077983686 0.112612642
210 6 0.02083448 0.979165521 0.966434504 0.947895364 0.922016314 0.033565496 0.052104636 0.077983686
240 6 0.012454003 0.987545997 0.979165521 0.966434504 0.947895364 0.02083448 0.033565496 0.052104636
270 6 0.007165821 0.99283418 0.987545997 0.979165521 0.966434504 0.012454003 0.02083448 0.033565496
300 6 0.003967102 0.996032898 0.99283418 0.987545997 0.979165521 0.007165821 0.012454003 0.02083448
330 6 0.002112387 0.997887613 0.996032898 0.99283418 0.987545997 0.003967102 0.007165821 0.012454003
360 6 0.001081502 0.998918498 0.997887613 0.996032898 0.99283418 0.002112387 0.003967102 0.007165821
390 6 0.000532246 0.999467754 0.998918498 0.997887613 0.996032898 0.001081502 0.002112387 0.003967102
420 6 0.000251721 0.999748279 0.999467754 0.998918498 0.997887613 0.000532246 0.001081502 0.002112387
450 6 0.000114381 0.999885619 0.999748279 0.999467754 0.998918498 0.000251721 0.000532246 0.001081502
;
run;
Can you show us what you want your data to look like?
In the first part of your code you reference a variable named visit, but no such variable exists in the dataset you provided.
The variable visit is generated by the first line in the code I posted : /**** generate observation number**********/
data prob;set prob;by id;if first.id then visit=1;else visit+1;run;
I am attaching the output for the variable c that I expect.
@T_Reddy wrote:
Sure. Thanks for the speedy response. See below
data test;
input day id probgt1000 problt1000 lagproblt1000 lag2problt1000 lag3problt1000 lagprobgt1000 lag2probgt1000 lag3probgt1000;
datalines;
0 1 0.999999838 1.62E-07
2 1 0.999991138 8.86E-06 1.62E-07 0.999999838
7 1 0.987787081 0.012212919 8.86E-06 1.62E-07 0.999991138 0.999999838
16 1 0.237545783 0.762454217 0.012212919 8.86E-06 1.62E-07 0.987787081 0.999991138 0.999999838
29 1 0.079696412 0.920303588 0.762454217 0.012212919 8.86E-06 0.237545783 0.987787081 0.999991138
57 1 0.001868728 0.998131272 0.920303588 0.762454217 0.012212919 0.079696412 0.237545783 0.987787081
91 1 8.13E-06 0.999991865 0.998131272 0.920303588 0.762454217 0.001868728 0.079696412 0.237545783
120 1 1.15E-07 0.999999885 0.999991865 0.998131272 0.920303588 8.13E-06 0.001868728 0.079696412
150 1 6.64E-10 0.999999999 0.999999885 0.999991865 0.998131272 1.15E-07 8.13E-06 0.001868728
Is this supposed to be similar to your PROB dataset? It is missing the variable VISIT, which is needed to calculate C.
Hence C is missing in PROBC. As a result, all of the LAG values for C are missing, and all of the calculated C values are missing in PROBC2.
data prob;set test;by id;if first.id then visit=1;else visit+1;run;
/****Part 1: generate c2-c3**********/
data probc;set prob;by id;
if visit eq 2 then c=(lagprobgt1000*probgt1000) + (lagproblt1000*probgt1000) + (lagprobgt1000*problt1000);
if visit eq 3 then c= (lag2probgt1000*lagprobgt1000*probgt1000) + (lag2problt1000*lagprobgt1000*probgt1000) + (lag2probgt1000*lagproblt1000*probgt1000) + (lag2probgt1000*lagprobgt1000*problt1000) + (lag2problt1000*lagprobgt1000*problt1000);
run;
/***** Part 2 Recursively calculate using the initial probabilities******
This is not the most efficient way *****/
data probc2;set probc; by id; lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
lagc = lag(c);
lag2c = lag2(c);
if c=. then c=(lag2c*lagprobgt1000*problt1000) + (lagc*probgt1000);
etc
You posted way more data than we need to test. Too many observations and too many variables.
It looks like you have just ID, VISIT, DAY, PROB and your EXPECTED value for C. The other probability is just 1-PROB, and the remaining variables are just lags of the two prob variables, which are easy to calculate.
data have ;
infile cards dsd dlm='|' truncover ;
input id visit day prob expect ;
cards;
1|1|0|0.999999838|
1|2|2|0.999991138|1
1|3|7|0.987787081|0.999999892
1|4|16|0.237545783|0.990688183
1|5|29|0.079696412|0.297568507
1|6|57|0.001868728|0.079362825
1|7|91|8.13E-6|0.000556716
1|8|120|1.15E-7|6.46E-7
1|9|150|6.64E-10|6.43E-11
1|10|180|1.76E-12|4.29E-16
1|11|210|2.11E-15|1.13E-22
1|12|240|0|9.05E-31
;
proc print;
format _numeric_ best32.;
run;
You said the problem was recursive. To do that we need to know two things.
1) How to calculate the initial value. It looks like it is unknown when visit=1 and equal to 1 when visit=2.
2) How to calculate the Nth value given the previous values.
From your post it looks like the formula is somehow based on the previous two values.
The initial values are c2 and c3. c1 is essentially zero.
This is how the initial values c2 and c3 are calculated:
data probc;set prob;by id;
if visit eq 2 then c=(lagproblt1000*problt1000) + (lagprobgt1000*problt1000) + (lagproblt1000*probgt1000);
if visit eq 3 then c= (lag2problt1000*lagproblt1000*problt1000) + (lag2probgt1000*lagproblt1000*problt1000) + (lag2problt1000*lagprobgt1000*problt1000) + (lag2problt1000*lagproblt1000*probgt1000) + (lag2probgt1000*lagproblt1000*probgt1000);
run;
Thereafter for visit>3
c=(lag2c*lagproblt1000*probgt1000) + (lagc*problt1000)
where lag2c is the value of c two visits earlier and lagc is the value of c one visit earlier.
Getting closer to a problem definition.
Let's use the notation C1,C2,C3,.... for the new series and P1,P2,P3,... for the original series. Also let's add the notation Q1,Q2,Q3,... to mean the series defined by (1-P). So your ProbLT1000 variable is the P series and your ProbGT1000 variable is the Q series.
If I am reading your code properly your definition is:
- C1 = 0
- C2 = P1*P2 + Q1*P2 + P1*Q2
- C3 = P1*P2*P3 + Q1*P2*P3 + P1*Q2*P3 + P1*P2*Q3 + Q1*P2*Q3
- C4 = C2*P3*Q4 + C3*P4
- C(n+1) = C(n-1)*P(n)*Q(n+1) + C(n)*P(n+1)
Did I translate that properly?
Yes, exactly!
I think I have translated this into code now. Here is the sample input data. We really just need the ID, DAY (to ensure the order is right) and the P value. I added in your expected C series.
data have ;
infile cards dsd dlm='|' truncover ;
input id day p expect ;
cards;
1|0|0.999999838|
1|2|0.999991138|1
1|7|0.987787081|0.999999892
1|16|0.237545783|0.990688183
1|29|0.079696412|0.297568507
1|57|0.001868728|0.079362825
1|91|8.13E-6|0.000556716
1|120|1.15E-7|6.46E-7
1|150|6.64E-10|6.43E-11
1|180|1.76E-12|4.29E-16
1|210|2.11E-15|1.13E-22
1|240|0|9.05E-31
;
Now set that data and you can calculate the variables needed for the formulas from those values. Instead of trying to get the LAG() function to work, I just used RETAIN. I moved the OUTPUT statement to before the point where the lagged values are updated, so the output shows the values that were used to calculate C.
data want ;
set have ;
by id day;
length visit c lagp lag2p q lagq lag2q lagc lag2c 8;
if first.id then call missing(of visit lag:);
visit+1;
q=1-p;
select (visit);
when (1) c=.;
when (2) c=lagp*p + lagq*p + lagp*q ;
when (3) c=lag2p*lagp*p + lag2q*lagp*p + lag2p*lagp*q + lag2q*lagp*q + lag2p*lagq*p;
otherwise c=lag2c*lagp*q + lagc*p ;
end;
output;
lag2p=lagp;
lagp=p;
lag2q=lagq;
lagq=q;
lag2c=lagc;
lagc=c;
retain lag: ;
run;
proc print; run;
You need to post sample data, the expected output, and a small subset of your code; we probably aren't going to read through a lot of lines. Basically, post a simplified version that is still complex enough to demonstrate the issue.
That said, you could look into array processing and/or IML, which is a matrix language and seems more in line with this type of calculation.