Greetings,
I have a rather peculiar data situation that I am stuck on.
I have a list of sample id's that undergoes 2 tests during each session: a RED & BLUE test. The test result values are either 0 (negative) or 1 (positive).
To make a diagnosis based on either tests, 2 consecutive positive tests results are required. However, there are sessions where the test results are missing (denoted by "."). It is OK for these "missing" sessions to be in between 2 consecutive positive results.
The data looks like:
Note that the test diagnosis is currently made at the 2nd consecutive positive result session. Furthermore, after a diagnosis is made, it is kept through the sample's following sessions, until a diagnosis based on a different test is made by the consecutive rule (see ID 2).
I would like to update the diagnosis column to where the test diagnosis is made at the 1st of the 2-consecutive-positive-result sessions, as indicated in the last column "desired diagnosis output".
How would I go about updating the previous session rows?
Thank you for your help!
How about this? I assume that "Blue" Trumps "Red" and that only those two are considered.
Feel free to ask 🙂
data have;
input id session red_test blue_test;
datalines;
1 1 0 0
1 3 1 0
1 4 1 0
1 5 1 1
1 6 1 0
2 1 1 0
2 2 1 1
2 3 0 0
2 5 0 0
2 7 0 1
2 8 0 1
2 9 0 0
3 1 1 0
3 2 . .
3 3 1 0
3 4 0 1
;
data want;
set have curobs = c;
by id;
if first.id then diagnosis = ' ';
if diagnosis = ' ' then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id red_test rename = (id = i red_test = r)) point = p nobs = n;
if id ne i then leave;
if red_test = 1 and r = 0 then leave;
if red_test = 1 and r = 1 then do;
diagnosis = 'RED ';
leave;
end;
end;
else if diagnosis in (' ', 'RED ') then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id blue_test rename = (id = i blue_test = b)) point = p nobs = n;
if id ne i then leave;
if blue_test = 1 and b = 0 then leave;
if blue_test = 1 and b = 1 then do;
diagnosis = 'BLUE';
leave;
end;
end;
retain diagnosis;
drop i r b;
run;
Result:
id session red_test blue_test diagnosis 1 1 0 0 1 3 1 0 RED 1 4 1 0 RED 1 5 1 1 RED 1 6 1 0 RED 2 1 1 0 RED 2 2 1 1 RED 2 3 0 0 RED 2 5 0 0 RED 2 7 0 1 BLUE 2 8 0 1 BLUE 2 9 0 0 BLUE 3 1 1 0 RED 3 2 . . RED 3 3 1 0 RED 3 4 0 1 RED
How about this? I assume that "Blue" Trumps "Red" and that only those two are considered.
Feel free to ask 🙂
data have;
input id session red_test blue_test;
datalines;
1 1 0 0
1 3 1 0
1 4 1 0
1 5 1 1
1 6 1 0
2 1 1 0
2 2 1 1
2 3 0 0
2 5 0 0
2 7 0 1
2 8 0 1
2 9 0 0
3 1 1 0
3 2 . .
3 3 1 0
3 4 0 1
;
data want;
set have curobs = c;
by id;
if first.id then diagnosis = ' ';
if diagnosis = ' ' then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id red_test rename = (id = i red_test = r)) point = p nobs = n;
if id ne i then leave;
if red_test = 1 and r = 0 then leave;
if red_test = 1 and r = 1 then do;
diagnosis = 'RED ';
leave;
end;
end;
else if diagnosis in (' ', 'RED ') then do p = (c + 1) by 1;
if p > n then leave;
set have (keep = id blue_test rename = (id = i blue_test = b)) point = p nobs = n;
if id ne i then leave;
if blue_test = 1 and b = 0 then leave;
if blue_test = 1 and b = 1 then do;
diagnosis = 'BLUE';
leave;
end;
end;
retain diagnosis;
drop i r b;
run;
Result:
id session red_test blue_test diagnosis 1 1 0 0 1 3 1 0 RED 1 4 1 0 RED 1 5 1 1 RED 1 6 1 0 RED 2 1 1 0 RED 2 2 1 1 RED 2 3 0 0 RED 2 5 0 0 RED 2 7 0 1 BLUE 2 8 0 1 BLUE 2 9 0 0 BLUE 3 1 1 0 RED 3 2 . . RED 3 3 1 0 RED 3 4 0 1 RED
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.