Here is a data set of your "example" data for others to use as a start.
data have;
informat id $5. pos_test1-Pos_test3 f1.
Complete_date Closest Collect_date1 Collect_date2 Collect_date3 mmddyy10.;
format Complete_date Closest Collect_date1 Collect_date2 Collect_date3 mmddyy10.;
input id Pos_test1 Pos_test2 Pos_test3 Complete_date Closest Collect_date1 Collect_date2 Collect_date3;
datalines;
ID1 1 1 . 5/21/2021 6/20/2021 6/20/2021 6/10/2021 .
ID2 1 . . 3/2/2021 5/10/2021 5/10/2021 . .
ID3 0 0 1 6/4/2021 7/10/2021 7/10/2021 6/20/2021 6/22/2021
;
Now a question/comment: Why is the Collectdate_1 later then Collect_date2 but Collect_date_3 is later than 2 but before 1?
I think that if your collect_dates were in chronologic order (with the positive values correlating) this would be an extremely simple exercise.
As it is, you could use another array that holds the differences and search that.
If I understand your requirement this may be one way.
data want;
set have;
array pt (*) pos_test: ;
array cd (*) collect_date: ;
/* to hold the differences, since new variables need explicit size*/
array dd (3) ;
do i=1 to dim(pt);
if pt[i]=1 then dd[i]=(abs(cd[i] - complete_date));
end;
if min(of dd(*))> 0 then pos = whichn(min(of dd(*)),of dd(*));
if pos>0 then new_closest= cd[pos]; format new_closest mmddyy10.;
run;
The dd array holds the date differences only when the test is positive. Absolute values are used to get "closest".
The Whichn, and character version Whichc, returns the first position that the value in the first position is found occurs in a list of following values. The "if min(of " bit of code is to only try to find a minimum position when there is at least one positive test.
I also create a new_closest so you can actually compare with the current value.
I really didn't understand why you were restricting
... View more