## Selecting observations

Solved
Occasional Contributor
Posts: 7

# Selecting observations

I have the following data set, I am trying to select only observations

with at least 1 positive and 1 negative outcome.

``````Data Diagnose;
Input @1 Patient_No \$2.
@3 Date MMDDYY10.
@14 Visit_No \$2.
@16 Outcome \$12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
4 08/29/2001 1 Positive
4 09/05/2001 2 Positive
4 11/01/2001 3 Positive
;
run;``````

I tried the following code.

data a;
set diagnose;
where outcome = "Positive" and "Negative";
proc print;
run;

Accepted Solutions
Solution
‎11-12-2017 02:19 AM
Occasional Contributor
Posts: 7

## Re: Selecting observations

Yes, that works. Thank you for the addition of patient no 5.

All Replies
Posts: 5,523

## Re: Selecting observations

Beware of the cases where there are more than one distinct outcome but only one of them is Positive or Negative (patient_no = 5 below) :

``````Data Diagnose;
Input @1 Patient_No \$2.
@3 Date MMDDYY10.
@14 Visit_No \$2.
@16 Outcome \$12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
4 08/29/2001 1 Positive
4 09/05/2001 2 Positive
4 11/01/2001 3 Positive
5 11/22/2000 1 Low positive
5 11/29/2000 2 Negative
5 12/28/2000 3 Negative
5 06/28/2001 4 Negative
;

proc sql;
create table a as
select *
from diagnose
group by patient_no
having count( distinct
case outcome
when "Positive" then 1
when "Negative" then 2
else . end ) > 1
order by patient_no, visit_no;
select * from a;
quit;``````
PG
Solution
‎11-12-2017 02:19 AM
Occasional Contributor
Posts: 7

## Re: Selecting observations

Yes, that works. Thank you for the addition of patient no 5.

Posts: 5,523

## Re: Selecting observations

For a non SQL solution:

``````data b;
do until(last.patient_no);
set diagnose; by patient_no notsorted;
pos = pos or outcome = "Positive";
neg = neg or outcome = "Negative";
end;
do until(last.patient_no);
set diagnose; by patient_no notsorted;
if pos and neg then output;
end;
drop pos neg;
run;

proc print noobs; run;``````
PG
Super Contributor
Posts: 285

## Re: Selecting observations

@ PGStats, can you explan how does this code works?
pos = pos or outcome = "Positive";
I tried below, and it does not work.
pos = (outcome = "Positive");
Posts: 5,523

## Re: Selecting observations

Simple logic. At the start of a patient obs, pos is missing which is interpreted as False. The logic expression ensures that at each iteration over the visits of a patient, if pos is already True, it remains True (True OR <anything> = True), otherwise it becomes true if outcome="Positive".

Thus, after iterating over a patients' visits, pos = True if "Positive" was met at least once.

PG
Super User
Posts: 10,217

## Re: Selecting observations

Since there's only one outcome value in your observations, it can never be positive and negative at the same time.

Define what you consider an "observation"; in SAS speak, an observation is a single row of data in a dataset.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 7