I completely understand the difficulty, apologies! It is a protected dataset, but it looks something like this once in SAS and cleaned:

ID   Pos_test1  Pos_test2  Pos_test3  Complete_date  Closest    Collect_date1  Collect_date2  Collect_date3
ID1  1          1          .          5/21/2021      6/20/2021  6/20/2021      6/10/2021      .
ID2  1          .          .          3/2/2021       5/10/2021  5/10/2021      .              .
ID3  0          0          1          6/4/2021       7/10/2021  7/10/2021      6/20/2021      6/22/2021

Maybe this helps to visualize. Each individual can have up to 50 tests, but many have only 1 or just a few.

Thank you for the tip about the 1, an easy oversight! However, that was intentional. Oftentimes collect_date1 is the date desired, so I set the baseline at collect_date1. Then, in later steps, I attempted to write that if any other collect_date (2 through 50) is closer to complete_date than collect_date1, that date should be chosen as "closest" instead.

The issue with the current code is that for ID1 it chooses collect_date1 as the closest date to complete_date, instead of collect_date2. ID2 is correct by default, but the issue returns with ID3. There it should ignore collect_date1, because pos_test1 is 0. It correctly recognizes that collect_date3 and pos_test3 should be included, but it still returns the value of collect_date1 as the closest value.

I'm not sure why it isn't overwriting the original closest variable when it finds a closer date. Maybe I don't need the placeholder of the first 2 collect_date1 lines? I will try that to see if it fixes it.
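For reference, the intended logic could be sketched as a DATA step along the following lines. This is only a sketch under assumptions, not the code from this thread: the dataset names `have` and `want` and the helper variables `i`, `diff`, and `best_diff` are hypothetical, the dates are assumed to be numeric SAS dates, and only three tests are shown (the real data would use `pos_test1-pos_test50` and `collect_date1-collect_date50`). Instead of using collect_date1 as a baseline, it starts with closest set to missing, so a test that is not positive can never be picked by default.

```sas
/* Illustrative rows copied from the table above (not the real protected data). */
data have;
  input id $ pos_test1-pos_test3
        (complete_date collect_date1-collect_date3) (:mmddyy10.);
  format complete_date collect_date1-collect_date3 mmddyy10.;
  datalines;
ID1 1 1 . 5/21/2021 6/20/2021 6/10/2021 .
ID2 1 . . 3/2/2021 5/10/2021 . .
ID3 0 0 1 6/4/2021 7/10/2021 6/20/2021 6/22/2021
;
run;

data want;
  set have;
  array pos{*} pos_test1-pos_test3;        /* assumed: 1 = positive test   */
  array cd{*}  collect_date1-collect_date3;

  closest   = .;   /* no baseline: start with nothing chosen              */
  best_diff = .;   /* smallest distance in days seen so far               */

  do i = 1 to dim(cd);
    /* only consider positive tests that have a collection date */
    if pos{i} = 1 and not missing(cd{i}) then do;
      diff = abs(cd{i} - complete_date);
      /* strict < means a tie keeps the lower-numbered test */
      if missing(best_diff) or diff < best_diff then do;
        best_diff = diff;
        closest   = cd{i};
      end;
    end;
  end;

  format closest mmddyy10.;
  drop i diff best_diff;
run;
```

On the three illustrative rows this should give 6/10/2021 for ID1 (20 days from complete_date versus 30 for collect_date1), 5/10/2021 for ID2, and 6/22/2021 for ID3, since only pos_test3 is positive there. If no test is positive for an individual, closest stays missing.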