I have posted a similar query in a different thread and have received some great responses. My requirements are now slightly different than what I initially had. I have a sample dataset (below) that links an IP to an Account login attempt. I want to identify any IP (alongside the Account and Date) that has attempted to log into three or more accounts within any 7 day window. data have; infile cards dsd ; input IP :$30. Account $ Date :datetime20.; format date datetime20.; cards; 108.950.45, 12345, 01MAY2020:11:39:55 108.950.45, 54321, 09MAY2020:12:55:22 108.950.45, 67890, 10MAY2020:03:00:02 108.950.45, 09876, 11MAY2020:06:14:03 108.950.45, 45879, 11MAY2020:12:37:35 108.950.45, 78562, 13MAY2020:13:13:09 108.950.45, 54321, 17MAY2020:07:12:03 108.950.45, 54686, 30MAY2020:10:00:02 220.540.40, 45879, 05MAY2020:09:09:15 220.540.40, 94532, 11MAY2020:03:41:16 220.540.40, 45879, 11MAY2020:14:00:58 110.640.35, 54873, 05MAY2020:17:22:01 110.640.35, 56454, 05MAY2020:19:22:05 110.640.35, 87654, 20MAY2020:05:14:12 ; My output would hence ideally read: 108.950.45, 54321, 09MAY2020:12:55:22 108.950.45, 54321, 10MAY2020:03:00:02 108.950.45, 09876, 11MAY2020:06:14:03 108.950.45, 45879, 11MAY2020:12:37:35 108.950.45, 78562, 13MAY2020:13:13:09 108.950.45, 67890, 17MAY2020:07:12:03 In the above output the following criteria is satisfied: (i) three or more unique Account IDs to the IP; (ii) login attempts occurred within a 7 day window. Even though rows with dates 09MAY2020 and 17MAY2020 are more than seven days a part, they are still associated to a 7 day streak. Thank you!
... View more