DATA Step, Macro, Functions and more

Count observation by ID over rolling 3 day time period

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Count observation by ID over rolling 3 day time period

[ Edited ]

I would like to count the occurrences of an IP address over a rolling three day time frame.  Below is sample data.

 

ID_NumberIP_AddressLogon_Time
199.98.127.20806DEC2017:10:55:01.855000
299.98.145.4512DEC2013:00:32:57.354000
399.98.145.4503JAN2014:13:53:06.453000
499.98.145.4503JAN2014:14:10:51.439000
599.98.145.4519JAN2014:04:44:20.043000
699.98.148.18515NOV2017:23:42:09.202000
799.98.148.18504DEC2017:22:52:47.562000
899.98.148.18511JAN2018:15:12:00.020000
999.98.148.18511JAN2018:16:13:25.536000
1099.98.148.18512JAN2018:16:13:25.536000
1199.98.148.18513JAN2018:16:13:25.536000
1299.98.148.18514JAN2018:16:13:25.536000
1399.98.148.18513MAR2018:20:50:20.437000
1499.98.171.5614JUN2016:22:30:45.895000
1599.98.171.5607JUL2016:20:17:28.182000
1699.98.171.5607JUL2016:21:02:56.512000
1799.98.237.21416JUL2012:18:27:58.504000
1899.98.237.21417JUL2012:19:50:17.975000

 

The output should look like the following:

 

ID_NumberIP_AddressLogon_TimeCount 
199.98.127.20806DEC2017:10:55:01.8550001
299.98.145.4512DEC2013:00:32:57.3540001
399.98.145.4503JAN2014:13:53:06.4530002
499.98.145.4503JAN2014:14:10:51.4390002
599.98.145.4519JAN2014:04:44:20.0430001
699.98.148.18515NOV2017:23:42:09.2020001
799.98.148.18504DEC2017:22:52:47.5620001
899.98.148.18511JAN2018:15:12:00.0200004
999.98.148.18511JAN2018:16:13:25.5360004
1099.98.148.18512JAN2018:16:13:25.5360005
1199.98.148.18513JAN2018:16:13:25.5360005
1299.98.148.18514JAN2018:16:13:25.5360003
1399.98.148.18513MAR2018:20:50:20.4370001
1499.98.171.5614JUN2016:22:30:45.8950001
1599.98.171.5607JUL2016:20:17:28.1820002
1699.98.171.5607JUL2016:21:02:56.5120002
1799.98.237.21416JUL2012:18:27:58.5040002
1899.98.237.21417JUL2012:19:50:17.9750002

 

The code should be able to do a count of both 3 days ahead and 3 days behind.  See ID_Numbers 8-12 above.


Accepted Solutions
Solution
‎04-20-2018 12:48 PM
PROC Star
Posts: 1,357

Re: Count observation by ID over rolling 3 day time period

[ Edited ]
data have;
infile datalines truncover;
input (ID_Number	IP_Address)(:$50.)	Logon_Time : datetime30.;
format Logon_Time  datetime30.;
datalines;
1	99.98.127.208	06DEC2017:10:55:01.855000
2	99.98.145.45	12DEC2013:00:32:57.354000
3	99.98.145.45	03JAN2014:13:53:06.453000
4	99.98.145.45	03JAN2014:14:10:51.439000
5	99.98.145.45	19JAN2014:04:44:20.043000
6	99.98.148.185	15NOV2017:23:42:09.202000
7	99.98.148.185	04DEC2017:22:52:47.562000
8	99.98.148.185	11JAN2018:15:12:00.020000
9	99.98.148.185	11JAN2018:16:13:25.536000
10	99.98.148.185	12JAN2018:16:13:25.536000
11	99.98.148.185	13JAN2018:16:13:25.536000
12	99.98.148.185	14JAN2018:16:13:25.536000
13	99.98.148.185	13MAR2018:20:50:20.437000
14	99.98.171.56	14JUN2016:22:30:45.895000
15	99.98.171.56	07JUL2016:20:17:28.182000
16	99.98.171.56	07JUL2016:21:02:56.512000
17	99.98.237.214	16JUL2012:18:27:58.504000
18	99.98.237.214	17JUL2012:19:50:17.975000
;


data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have',multidata:'y') ;
   h.definekey  ("IP_Address") ;
   h.definedata ("Logon_Time") ;
   h.definedone () ;
end;
set have(rename=(Logon_Time=_Logon_Time));
count=0;
rc = h.find();
do while(rc = 0);
if intnx('days',datepart(_Logon_Time),-3+1)<=datepart(Logon_Time)<=intnx('days',datepart(_Logon_Time),+3-1) then count+1;
rc = h.find_next();
end;
drop rc Logon_Time;
run;

View solution in original post


All Replies
Super User
Posts: 22,874

Re: Count observation by ID over rolling 3 day time period

Is your time frame three days ahead then? Three days back or a plus-minus 3 day window on either side?

 

For a record today, would the interval be from April 17 to April 23 inclusive or would it be April 18 to April 22 inclusive?

 

Do you have a SAS /ETS license?


@Sth19 wrote:

I would like to count the occurrences of an IP address over a rolling three day time frame.  Below is sample data.

 

ID_Number IP_Address Logon_Time
1 99.98.127.208 06DEC2017:10:55:01.855000
2 99.98.145.45 12DEC2013:00:32:57.354000
3 99.98.145.45 03JAN2014:13:53:06.453000
4 99.98.145.45 03JAN2014:14:10:51.439000
5 99.98.145.45 19JAN2014:04:44:20.043000
6 99.98.148.185 15NOV2017:23:42:09.202000
7 99.98.148.185 04DEC2017:22:52:47.562000
8 99.98.148.185 11JAN2018:15:12:00.020000
9 99.98.148.185 11JAN2018:16:13:25.536000
10 99.98.148.185 12JAN2018:16:13:25.536000
11 99.98.148.185 13JAN2018:16:13:25.536000
12 99.98.148.185 14JAN2018:16:13:25.536000
13 99.98.148.185 13MAR2018:20:50:20.437000
14 99.98.171.56 14JUN2016:22:30:45.895000
15 99.98.171.56 07JUL2016:20:17:28.182000
16 99.98.171.56 07JUL2016:21:02:56.512000
17 99.98.237.214 16JUL2012:18:27:58.504000
18 99.98.237.214 17JUL2012:19:50:17.975000

 

The output should look like the following:

 

ID_Number IP_Address Logon_Time Count 
1 99.98.127.208 06DEC2017:10:55:01.855000 1
2 99.98.145.45 12DEC2013:00:32:57.354000 1
3 99.98.145.45 03JAN2014:13:53:06.453000 2
4 99.98.145.45 03JAN2014:14:10:51.439000 2
5 99.98.145.45 19JAN2014:04:44:20.043000 1
6 99.98.148.185 15NOV2017:23:42:09.202000 1
7 99.98.148.185 04DEC2017:22:52:47.562000 1
8 99.98.148.185 11JAN2018:15:12:00.020000 4
9 99.98.148.185 11JAN2018:16:13:25.536000 4
10 99.98.148.185 12JAN2018:16:13:25.536000 5
11 99.98.148.185 13JAN2018:16:13:25.536000 5
12 99.98.148.185 14JAN2018:16:13:25.536000 3
13 99.98.148.185 13MAR2018:20:50:20.437000 1
14 99.98.171.56 14JUN2016:22:30:45.895000 1
15 99.98.171.56 07JUL2016:20:17:28.182000 2
16 99.98.171.56 07JUL2016:21:02:56.512000 2
17 99.98.237.214 16JUL2012:18:27:58.504000 2
18 99.98.237.214 17JUL2012:19:50:17.975000 2

 

The code should be able to do a count of both 3 days ahead and 3 days behind.  See ID_Numbers 8-12 above.


 

Solution
‎04-20-2018 12:48 PM
PROC Star
Posts: 1,357

Re: Count observation by ID over rolling 3 day time period

[ Edited ]
data have;
infile datalines truncover;
input (ID_Number	IP_Address)(:$50.)	Logon_Time : datetime30.;
format Logon_Time  datetime30.;
datalines;
1	99.98.127.208	06DEC2017:10:55:01.855000
2	99.98.145.45	12DEC2013:00:32:57.354000
3	99.98.145.45	03JAN2014:13:53:06.453000
4	99.98.145.45	03JAN2014:14:10:51.439000
5	99.98.145.45	19JAN2014:04:44:20.043000
6	99.98.148.185	15NOV2017:23:42:09.202000
7	99.98.148.185	04DEC2017:22:52:47.562000
8	99.98.148.185	11JAN2018:15:12:00.020000
9	99.98.148.185	11JAN2018:16:13:25.536000
10	99.98.148.185	12JAN2018:16:13:25.536000
11	99.98.148.185	13JAN2018:16:13:25.536000
12	99.98.148.185	14JAN2018:16:13:25.536000
13	99.98.148.185	13MAR2018:20:50:20.437000
14	99.98.171.56	14JUN2016:22:30:45.895000
15	99.98.171.56	07JUL2016:20:17:28.182000
16	99.98.171.56	07JUL2016:21:02:56.512000
17	99.98.237.214	16JUL2012:18:27:58.504000
18	99.98.237.214	17JUL2012:19:50:17.975000
;


data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have',multidata:'y') ;
   h.definekey  ("IP_Address") ;
   h.definedata ("Logon_Time") ;
   h.definedone () ;
end;
set have(rename=(Logon_Time=_Logon_Time));
count=0;
rc = h.find();
do while(rc = 0);
if intnx('days',datepart(_Logon_Time),-3+1)<=datepart(Logon_Time)<=intnx('days',datepart(_Logon_Time),+3-1) then count+1;
rc = h.find_next();
end;
drop rc Logon_Time;
run;
Super Contributor
Super Contributor
Posts: 265

Re: Count observation by ID over rolling 3 day time period

For record 12, you show a count of 3 in the desired results. 

 

Should that be 4?  I think records 9, 10, 11, and 12 all fall within the 3 day time frame, but I may be wrong. Record 8 is too far back and Record 13 is too far forward.  It may be an < vs <= thing. 

Super User
Posts: 10,626

Re: Count observation by ID over rolling 3 day time period

data have;
infile datalines truncover;
input (ID_Number	IP_Address)(:$50.)	Logon_Time : datetime30.;
format Logon_Time  datetime30.;
datalines;
1	99.98.127.208	06DEC2017:10:55:01.855000
2	99.98.145.45	12DEC2013:00:32:57.354000
3	99.98.145.45	03JAN2014:13:53:06.453000
4	99.98.145.45	03JAN2014:14:10:51.439000
5	99.98.145.45	19JAN2014:04:44:20.043000
6	99.98.148.185	15NOV2017:23:42:09.202000
7	99.98.148.185	04DEC2017:22:52:47.562000
8	99.98.148.185	11JAN2018:15:12:00.020000
9	99.98.148.185	11JAN2018:16:13:25.536000
10	99.98.148.185	12JAN2018:16:13:25.536000
11	99.98.148.185	13JAN2018:16:13:25.536000
12	99.98.148.185	14JAN2018:16:13:25.536000
13	99.98.148.185	13MAR2018:20:50:20.437000
14	99.98.171.56	14JUN2016:22:30:45.895000
15	99.98.171.56	07JUL2016:20:17:28.182000
16	99.98.171.56	07JUL2016:21:02:56.512000
17	99.98.237.214	16JUL2012:18:27:58.504000
18	99.98.237.214	17JUL2012:19:50:17.975000
;
proc sql;
create table want as
 select *,(select count(IP_Address) from have where Logon_Time 
 between intnx('dtday',a.Logon_Time,-3) and  intnx('dtday',a.Logon_Time,3)) as count
  from have as a;
quit;
Super Contributor
Super Contributor
Posts: 265

Re: Count observation by ID over rolling 3 day time period

I was trying to come up with this but didn't know about the intnx function (thanks!) so had to go the long way round through multiple queries. This is neat.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 181 views
  • 5 likes
  • 5 in conversation