Use @Astounding's or @novinosrin''s solution because they are better, but I had to go for a pure SQL solution because that is what I do.
data test_scores;
input
@1 SUBJECT $1.
@3 TEST $7.
@10 YN $3.;
datalines;
1 Test1 Yes
1 Test2 Yes
1 Test2A Yes
1 Test3 Yes
2 Test1 Yes
2 Test2 Yes
2 Test2A No
2 Test3 Yes
2 Test4 Yes
3 Test1 Yes
3 Test2 Yes
3 Test2A Yes
3 Test3 Yes
;
run;
*we can identify the problem cases like this;
proc sql;
create table problems as
select a.*, 'Yes' as Discrepancy from
(
select subject, test, yn
from test_scores
where test = 'Test2' and YN = 'Yes'
union
select subject, test, yn
from test_scores
where test = 'Test2A' and YN = 'Yes'
) as a
group by subject
having count(subject) > 1;
quit;
*and mark them in the whole dataset like this;
proc sql;
create table all_with_marked_cases as
select a.subject, a.test, a.yn, b.discrepancy
from test_scores a left join problems b
on a.subject = b.subject and a.test = b.test and a.yn = b.yn;
quit;
Gives us
The SAS System
SUBJECT TEST YN Discrepancy
1 Test1 Yes
1 Test2 Yes Yes
1 Test2A Yes Yes
1 Test3 Yes
2 Test1 Yes
2 Test2 Yes
2 Test2A No
2 Test3 Yes
2 Test4 Yes
3 Test1 Yes
3 Test2 Yes Yes
3 Test2A Yes Yes
3 Test3 Yes
You may now return to your regularly scheduled programming.
... View more