Hi mkeintz,

Your discussion was very eye-opening, and I really appreciate it. Here are my answers to the questions you asked:

How do you want to treat instances of missing value in collection_date? You've already implied that you want to keep them. Should they enter into your subsequent evaluation of date gaps?

If a collection date is missing, I will use Episode_date or result_date instead. If all three dates are missing, the case should be deleted.

Also, your sample data seems to be already ordered by collection_date (except when collection_date is missing). Is this the actual condition of your data?

No, that is not the actual condition of the data.

Also, when reading your original post I thought you wanted to remove instances with gaps of LESS THAN 60 days. But your example suggests the opposite. Do you want to remove instances of MORE THAN 60 days? Are you just saying to keep the earliest date plus all dates within 60 days of it (plus missing dates)?

Sorry for the confusion. My fault. The rules for removing data are:

Same patient with two different diseases: no matter what the collection_date is, both events will be retained.

Same patient with the same disease and different collection_date values: the first event will be retained, and whether the second event is kept depends on the disease and on the difference between the two collection_date values. For example, C-Auris is one event for the whole life, so for any difference, only the first case would be retained. For Acinetobacter, on the other hand, if the difference is more than 60 days, both events will be retained (if less, the second event will be deleted). For CRE, the threshold is 90 days instead of 60.

If a patient has 3 collection dates, the first event will be retained. The second event will be retained according to the disease rule above. The third event will be retained if its collection_date difference from the second is more than 45 or 60 days; if less, it will be removed.
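To make the rules above concrete, here is a rough sketch in Python (not SAS, and not a tested solution for the real data). It makes several assumptions: the field name patient_id is hypothetical, the date fields are assumed to be spelled collection_date, episode_date and result_date, each event is compared with the most recently retained event for the same patient and disease, and the same disease window is applied to the third and later events. Any of these may need adjusting to match the intended results.

```python
from datetime import date
from math import inf

# Days that must be exceeded before a later event counts as a new one.
# C-Auris is one event for life, so its window is infinite.
WINDOW_DAYS = {"C-Auris": inf, "Acinetobacter": 60, "CRE": 90}

# Fallback order for the date used in gap calculations.
DATE_FIELDS = ("collection_date", "episode_date", "result_date")


def effective_date(rec):
    """Return the first non-missing date, or None if all are missing."""
    for field in DATE_FIELDS:
        if rec.get(field) is not None:
            return rec[field]
    return None


def dedupe(records):
    """Keep the first event per (patient, disease) plus later events
    that fall outside the disease window.

    Assumption: the gap is measured from the last *retained* event.
    """
    dated = [(effective_date(r), r) for r in records]
    # Cases with no usable date at all are deleted.
    dated = [(d, r) for d, r in dated if d is not None]
    # The data is not pre-sorted, so sort by patient, disease, date.
    dated.sort(key=lambda p: (p[1]["patient_id"], p[1]["disease"], p[0]))

    kept = []
    last_kept = {}  # (patient_id, disease) -> date of last retained event
    for d, rec in dated:
        key = (rec["patient_id"], rec["disease"])
        window = WINDOW_DAYS[rec["disease"]]
        if key not in last_kept or (d - last_kept[key]).days > window:
            kept.append(rec)
            last_kept[key] = d
    return kept
```

Different diseases for the same patient never interfere with each other here because the grouping key includes the disease.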
So what is the rule when you have two separate date ranges of less than 60 days? I.e., what if you have a given individual with collection dates on DAYS 1, 5 and DAYS 101, 105? These dates have two gaps of 4 days and 1 gap of 96 days.

In this case, only events on days 1 and 105 will be retained.

I've attached an Excel sheet with a sample of the data. Not sure if this is what you asked for!

Thank you