I have the following data set, I am trying to select only observations
with at least 1 positive and 1 negative outcome.
Data Diagnose;
Input @1 Patient_No $2.
@3 Date MMDDYY10.
@14 Visit_No $2.
@16 Outcome $12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
4 08/29/2001 1 Positive
4 09/05/2001 2 Positive
4 11/01/2001 3 Positive
;
run;
I tried the following code.
data a;
set diagnose;
where outcome = "Positive" and "Negative";
proc print;
run;
Yes, that works. Thank you for the addition of patient no 5.
Beware of the cases where there are more than one distinct outcome but only one of them is Positive or Negative (patient_no = 5 below) :
Data Diagnose;
Input @1 Patient_No $2.
@3 Date MMDDYY10.
@14 Visit_No $2.
@16 Outcome $12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
4 08/29/2001 1 Positive
4 09/05/2001 2 Positive
4 11/01/2001 3 Positive
5 11/22/2000 1 Low positive
5 11/29/2000 2 Negative
5 12/28/2000 3 Negative
5 06/28/2001 4 Negative
;
proc sql;
create table a as
select *
from diagnose
group by patient_no
having count( distinct
case outcome
when "Positive" then 1
when "Negative" then 2
else . end ) > 1
order by patient_no, visit_no;
select * from a;
quit;
Yes, that works. Thank you for the addition of patient no 5.
For a non SQL solution:
data b;
do until(last.patient_no);
set diagnose; by patient_no notsorted;
pos = pos or outcome = "Positive";
neg = neg or outcome = "Negative";
end;
do until(last.patient_no);
set diagnose; by patient_no notsorted;
if pos and neg then output;
end;
drop pos neg;
run;
proc print noobs; run;
Simple logic. At the start of a patient obs, pos is missing which is interpreted as False. The logic expression ensures that at each iteration over the visits of a patient, if pos is already True, it remains True (True OR <anything> = True), otherwise it becomes true if outcome="Positive".
Thus, after iterating over a patients' visits, pos = True if "Positive" was met at least once.
Since there's only one outcome value in your observations, it can never be positive and negative at the same time.
Define what you consider an "observation"; in SAS speak, an observation is a single row of data in a dataset.
Yes, but the above example is transposed and ordered, should have pointed that out.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.