I have data where subjects come in for three separate visits during a study to track health issues.
I need to flag data that repeats the same issues for the same visit date of each subject.
EX:
Sub Visit Issue
1 1 A
1 2 B
1 2 C
1 3 A
2 1 A
2 2 C
2 2 B
2 2 C
2 3 D
Subject 2's second visit should be flagged because there are 2 issues C's posted in the data set.
Can you please post the expected output sample for the input sample? It leads me/perhaps some of us to assumptions that we can avoid
@Rcrowder wrote:
I have data where subjects come in for three separate visits during a study to track health issues.
I need to flag data that repeats the same issues for the same visit date of each subject.
EX:
Sub Visit Issue
1 1 A
1 2 B
1 2 C
1 3 A
2 1 A
2 2 C
2 2 B
2 2 C
2 3 D
Subject 2's second visit should be flagged because there are 2 issues C's posted in the data set.
Best is to provide example data in the form of a data step so we test code. Example:
data work.have; input Sub Visit Issue $; datalines; 1 1 A 1 2 B 1 2 C 1 3 A 2 1 A 2 2 C 2 2 B 2 2 C 2 3 D ;
Paste into a code box opened with the forum's {I} or "running man" icons to preserve formatting. the message windows will reformat text and sometimes the reformatting will change the data enough that the data step will not run.
One easy way to get just the problem records:
proc freq data=work.have noprint; tables sub*visit*issue/out=work.issuecount(where=(count>1)); run;
Or to set a flag, 1=True 0=False for duplicates:
proc sort data=work.have; by sub visit issue; run; data work.want; set work.have; by sub visit issue; flag= not first.issue ; run;
A By statement in the data set creates a number of Boolean variables named First. and Last. for each variable on the by statement. If the record is the first at a given level then it is True (or 1). So if something is not the first record with the same Issue value then negating that variable gets the flag.
Methinks the simplest is to use the SAS SQL's self-merge facility:
data have ;
input Sub Visit Issue :$1. ;
cards ;
1 1 A
1 2 B
1 2 C
1 3 A
2 1 A
2 2 C
2 2 B
2 2 C
2 3 D
;
run ;
proc sql ;
create table want as
select *, count(issue) > count(distinct issue) as flag
from have
group sub, visit
;
quit ;
Note: The original order of the records within (sub,visit) may change, for example:
Sub Visit Issue flag ----------------------------- 1 1 A 0 1 2 C 0 1 2 B 0 1 3 A 0 2 1 A 0 2 2 B 1 2 2 C 1 2 2 C 1 2 3 D 0
But since it's SQL, this is an expected behavior.
OTOH, since your data appear to be sorted by (sub,visit), you can get what you want and preserve the original record order using a simple double DoW loop:
data want (drop = _:) ;
do _n_ = 1 by 1 until (last.visit) ;
set have ;
by sub visit ;
if _n_ = 1 then _issue = issue ;
else if issue = _issue then flag = 1 ;
end ;
do _n_ = 1 to _n_ ;
set have ;
flag = ^^ flag ;
output ;
end ;
run ;
What you get as a result in the output is:
Sub Visit Issue flag ----------------------------- 1 1 A 0 1 2 B 0 1 2 C 0 1 3 A 0 2 1 A 0 2 2 C 1 2 2 B 1 2 2 C 1 2 3 D 0
However, your specs are rather vague, so there's a chance you want to flag only the records with the duplicate values of ISSUE rather than the entire (sub,visit) group where they are found. If this is the case, the hash object comes in handy, especially since its use doesn't assume that the input file is sorted - and it also preserves the original record order:
data want (drop = _:) ;
if _n_ = 1 then do ;
dcl hash h () ;
h.definekey ("sub", "visit", "issue") ;
h.definedata ("_q") ;
h.definedone () ;
do until (z) ;
set have end = z ;
if h.find() ne 0 then _q = 1 ;
else _q + 1 ;
h.replace() ;
end ;
end ;
set have ;
h.find() ;
flag = _q > 1 ;
run ;
As a result, you'll get:
Sub Visit Issue flag ----------------------------- 1 1 A 0 1 2 B 0 1 2 C 0 1 3 A 0 2 1 A 0 2 2 C 1 2 2 B 0 2 2 C 1 2 3 D 0
Kind regards
Paul D.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.