If you could build a linktable that looks like this:
Link
OP_dummy
FU_dummy
MED_dummy
1
1
1
2
1
3
1
1
1
4
1
5
1
6
1
You could loop over the rows (L=1 to 6) with logic like this, where _nlink=6:
do L=1 to _nlink;
call missing (of _sentinel1--_sentinel2);
if _linkdates{L,1}^=. then set optest;
if _linkdates{L,2}^=. then set futest;
if _linkdates{L,3}^=. then set medtest;
output;
end;
The above assumes the _sentinel1 variable is to the left of the variables from the 3 datasets, and _sentinel2 to the right, and the two-way array _linkdates includes only the rightmost 3 columns above.
So the task obviously is to build that link table in a first pass of each ID, so that the second pass could use the code above: The below does that:
data optest; input id op date_op MMDDYY10.; format date_op MMDDYY10.; datalines;
1 1 02/12/2020
1 1 02/13/2020
1 1 03/03/2021
1 1 03/05/2021
1 1 03/11/2021
;run;
data futest; input id fu date_fu MMDDYY10.; format date_fu MMDDYY10.; datalines;
1 1 03/05/2020
1 1 03/16/2021
1 1 01/16/2022
;run;
data medtest; input id med date_med MMDDYY10.; format date_med MMDDYY10.; datalines;
1 1 03/25/2021
data want (drop=_: L) ;
retain _sentinel1 . ;
if 0 then set optest futest medtest;
retain _sentinel2 . ;
array _linkdates {40,3}; *Accomodate up to 40 links of 3 dates (date_op,date_fu,date_med)**;
_nlink=0;
do until (last.id);
set optest (keep=id date_op rename=(date_op=_srtdate) in=inop)
futest (keep=id date_fu rename=(date_fu=_srtdate) in=infu)
medtest (keep=id date_med rename=(date_med=_srtdate) in=inmd) ;
by id _srtdate;
if inop then do;
_nlink+1; **New link**;
_linkdates{_nlink,1}=_srtdate; ** Renamed from DATE_OP **;
end;
else if infu then do;
if _nlink=0 then L=1; **In case FUTEST is 1st record for this id**;
else do L=1 to _nlink until (_srtdate-30 <= _linkdates{L,1});
end;
if L>_nlink then _nlink=L; **New link**;
_linkdates{L,2}=_srtdate; *DATE_FU in second column;
end;
else if inmd then do;
if _nlink=0 then L=1; **In case MEDTEST is 1st record for this id**;
else do L=1 to _nlink until (_srtdate-30 <= _linkdates{L,1} and _srtdate>=_linkdates{L,2});
end;
if L>_nlink then _nlink=L; **New link**;
_linkdates{L,3}=_srtdate; *DATE_MED in third column;
end;
end;
do L=1 to _nlink;
call missing (of _sentinel1--_sentinel2);
if _linkdates{L,1}^=. then set optest;
if _linkdates{L,2}^=. then set futest;
if _linkdates{L,3}^=. then set medtest;
output;
end;
run;
Be sure to set the row size of the _linkdates array to accomodated the largest number of linkages expected for any single ID.
Note the else do loops have no internal code. They just have a stopping condition based on the date relationships you posted. If none of the pre-existing links satisfy the stopping condition, then the index variable L will be exactly 1 greater than the prior _nlinks value - indicating need for a new link id.
Actually the _linkdates table in the program doesn't used dummy variables in the tables cells. It uses date values.
... View more