Greetings,
I have a rather peculiar data situation that I am stuck on.
I have a list of sample id's that undergoes 2 tests during each session: a RED & BLUE test. The test result values are either 0 (negative) or 1 (positive).
To make a diagnosis based on either tests, 2 consecutive positive tests results are required. However, there are sessions where the test results are missing (denoted by "."). It is OK for these "missing" sessions to be in between 2 consecutive positive results.
The data looks like:
Note that the test diagnosis is currently made at the 2nd consecutive positive result session. Furthermore, after a diagnosis is made, it is kept through the sample's following sessions, until a diagnosis based on a different test is made by the consecutive rule (see ID 2).
I would like to update the diagnosis column to where the test diagnosis is made at the 1st of the 2-consecutive-positive-result sessions, as indicated in the last column "desired diagnosis output".
How would I go about updating the previous session rows?
Thank you for your help!
How about this? I assume that "Blue" Trumps "Red" and that only those two are considered.
Feel free to ask 🙂
data have;
input id session red_test blue_test;
datalines;
1 1 0 0
1 3 1 0
1 4 1 0
1 5 1 1
1 6 1 0
2 1 1 0
2 2 1 1
2 3 0 0
2 5 0 0
2 7 0 1
2 8 0 1
2 9 0 0
3 1 1 0
3 2 . .
3 3 1 0
3 4 0 1
;
data want;
set have curobs = c;
by id;
if first.id then diagnosis = ' ';
if diagnosis = ' ' then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id red_test rename = (id = i red_test = r)) point = p nobs = n;
if id ne i then leave;
if red_test = 1 and r = 0 then leave;
if red_test = 1 and r = 1 then do;
diagnosis = 'RED ';
leave;
end;
end;
else if diagnosis in (' ', 'RED ') then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id blue_test rename = (id = i blue_test = b)) point = p nobs = n;
if id ne i then leave;
if blue_test = 1 and b = 0 then leave;
if blue_test = 1 and b = 1 then do;
diagnosis = 'BLUE';
leave;
end;
end;
retain diagnosis;
drop i r b;
run;
Result:
id session red_test blue_test diagnosis 1 1 0 0 1 3 1 0 RED 1 4 1 0 RED 1 5 1 1 RED 1 6 1 0 RED 2 1 1 0 RED 2 2 1 1 RED 2 3 0 0 RED 2 5 0 0 RED 2 7 0 1 BLUE 2 8 0 1 BLUE 2 9 0 0 BLUE 3 1 1 0 RED 3 2 . . RED 3 3 1 0 RED 3 4 0 1 RED
How about this? I assume that "Blue" Trumps "Red" and that only those two are considered.
Feel free to ask 🙂
data have;
input id session red_test blue_test;
datalines;
1 1 0 0
1 3 1 0
1 4 1 0
1 5 1 1
1 6 1 0
2 1 1 0
2 2 1 1
2 3 0 0
2 5 0 0
2 7 0 1
2 8 0 1
2 9 0 0
3 1 1 0
3 2 . .
3 3 1 0
3 4 0 1
;
data want;
set have curobs = c;
by id;
if first.id then diagnosis = ' ';
if diagnosis = ' ' then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id red_test rename = (id = i red_test = r)) point = p nobs = n;
if id ne i then leave;
if red_test = 1 and r = 0 then leave;
if red_test = 1 and r = 1 then do;
diagnosis = 'RED ';
leave;
end;
end;
else if diagnosis in (' ', 'RED ') then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id blue_test rename = (id = i blue_test = b)) point = p nobs = n;
if id ne i then leave;
if blue_test = 1 and b = 0 then leave;
if blue_test = 1 and b = 1 then do;
diagnosis = 'BLUE';
leave;
end;
end;
retain diagnosis;
drop i r b;
run;
Result:
id session red_test blue_test diagnosis 1 1 0 0 1 3 1 0 RED 1 4 1 0 RED 1 5 1 1 RED 1 6 1 0 RED 2 1 1 0 RED 2 2 1 1 RED 2 3 0 0 RED 2 5 0 0 RED 2 7 0 1 BLUE 2 8 0 1 BLUE 2 9 0 0 BLUE 3 1 1 0 RED 3 2 . . RED 3 3 1 0 RED 3 4 0 1 RED
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.