Help using Base SAS procedures

Count in a 60 Minute window

Reply
Contributor
Posts: 66

Count in a 60 Minute window

Hi There.

 

I have a data set that contains a datetime value that ive created from separate date and time variables and a unique identifier "IP" which is character.

 

What i cant quite figure out is I want to show a count of the same "IP" in a 60 minute window.

 

so for example 

 

IP           Count

123456   6

654321   1

233242   7

 

Etc.

 

Any help as always is appreciated.

 

Stret

Super User
Posts: 10,283

Re: Count in a 60 Minute window

[ Edited ]

Please post an example of your data (see my footnotes for how to post data) from which your expected result is to be derived.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 66

Re: Count in a 60 Minute window

Posted in reply to KurtBremser

@KurtBremser wrote:

Please post an example of your data (see my footnotes for how to post data) from which your expected result is to be derived.


Hi there i've attached a sample excel sheet

 

Apologies.

Super User
Posts: 10,283

Re: Count in a 60 Minute window

Excel spreadsheets are not SAS datasets, and they cannot be used to reliably represent SAS datasets (no column attributes etc). Furthermore, they can be used to transport malware, so many corporate institutions block their download.

Please follow the hints contained in my footnotes. It is NOT rocket science.

 

Finally, I can perfectly read SAS code on my tablet without need for any special app.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 66

Re: Count in a 60 Minute window

Posted in reply to KurtBremser

@KurtBremser wrote:

Excel spreadsheets are not SAS datasets, and they cannot be used to reliably represent SAS datasets (no column attributes etc). Furthermore, they can be used to transport malware, so many corporate institutions block their download.

Please follow the hints contained in my footnotes. It is NOT rocket science.

 

Finally, I can perfectly read SAS code on my tablet without need for any special app.


Hello,

 

I'm not sure the rudeness was warranted however I have looked at the three links in your signature and while it may not be rocket science i do not see the easy way to post the data. 

 

This is obviously beyond my level of understanding, which I make no apologies for, I am just looking for a little assistance on something that i don't know.

 

Kind Regards

 

Stret

Super User
Super User
Posts: 9,599

Re: Count in a 60 Minute window

"I have a data set that contains a datetime value that ive created from separate date and time variables and a unique identifier "IP""

Therefore:

data have;
  input ip dt e8601dt.;
new_date datalines; 123456 2017-08-21T12:01:00 123456 2017-08-26T11:00:00 ... run;

This means we can run it on our machine to get data which looks like yours.  Without this we are guessing, and hence can't write tested code which accurately answers your question.  SAS is a data driven programming language, so the data is very important to any coding.

 

In terms of your question, step 1 would be to create a variable based on what your window is.  Maybe:

data want;
  set have;
  retain base group;
  by ip;
  if first.ip=1 then do;
    group=1;
    base=dt;
  end;
  else do;
    if dt-base > ... then do;
      group=group+1;
      base=dt;
    end;
  end;
run;
  

So this keeps a "base" time, and changes group and base only when something goes outside the 60 minute window - how you do that check is up to you.

Contributor
Posts: 66

Re: Count in a 60 Minute window


@RW9 wrote:

"I have a data set that contains a datetime value that ive created from separate date and time variables and a unique identifier "IP""

Therefore:

data have;
  input ip dt e8601dt.;
new_date datalines; 123456 2017-08-21T12:01:00 123456 2017-08-26T11:00:00 ... run;

This means we can run it on our machine to get data which looks like yours.  Without this we are guessing, and hence can't write tested code which accurately answers your question.  SAS is a data driven programming language, so the data is very important to any coding.

 

In terms of your question, step 1 would be to create a variable based on what your window is.  Maybe:

data want;
  set have;
  retain base group;
  by ip;
  if first.ip=1 then do;
    group=1;
    base=dt;
  end;
  else do;
    if dt-base > ... then do;
      group=group+1;
      base=dt;
    end;
  end;
run;
  

So this keeps a "base" time, and changes group and base only when something goes outside the 60 minute window - how you do that check is up to you.


 

 

i understand completely I wasn't myself trying to be rude I just didn't see any obvious instructions on how to do it for an amateur and don't appreciate the sarcastic comments. 

 

I really do appreciate the help.

 

I'm more than happy to though

Contributor
Posts: 66

Re: Count in a 60 Minute window


@RW9 wrote:

"I have a data set that contains a datetime value that ive created from separate date and time variables and a unique identifier "IP""

Therefore:

data have;
  input ip dt e8601dt.;
new_date datalines; 123456 2017-08-21T12:01:00 123456 2017-08-26T11:00:00 ... run;

This means we can run it on our machine to get data which looks like yours.  Without this we are guessing, and hence can't write tested code which accurately answers your question.  SAS is a data driven programming language, so the data is very important to any coding.

 

In terms of your question, step 1 would be to create a variable based on what your window is.  Maybe:

data want;
  set have;
  retain base group;
  by ip;
  if first.ip=1 then do;
    group=1;
    base=dt;
  end;
  else do;
    if dt-base > ... then do;
      group=group+1;
      base=dt;
    end;
  end;
run;
  

So this keeps a "base" time, and changes group and base only when something goes outside the 60 minute window - how you do that check is up to you.


 

Hi Again.

 

Ok i think I see what you mean now, apologies for my knowledge on this

 

so if you run this code 

 

data have;
input IP DateTime DATETIME18.;

datalines;
123456 31MAR18:23:58:52
123456 31MAR18:23:57:19
123456 31MAR18:23:55:46
654321 31MAR18:23:44:18
654321 31MAR18:23:35:11


run;

 

then it will give you some sample data that is in the same format as mine, is that was you require?

 

so the output of the count would be 

 

IP           Count in 60 mins

123456             3

654321             2

 

 

Stret

Super User
Posts: 10,283

Re: Count in a 60 Minute window

A simple and short approach might look like this:

proc sql;
create table want as
select ip, datepart(datetime) as date, hour(timepart(datetime)) as hour, count(*) as count
from have
group by ip, calculated date, calculated hour;
quit;

If you need a different definition of "hour" (the sixty-minute window), you might have to create "hour" in a preliminary (data) step.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Valued Guide
Posts: 597

Re: Count in a 60 Minute window

When you say 60 Min, Which 60 min do you want to consider? Is  it from the first timestamp value, then look at this code:

 

data have;
Format DateTime DATETIME18.;
input IP DateTime DATETIME18.;
datalines;
123456 31MAR18:21:58:52
123456 31MAR18:21:57:19
123456 31MAR18:21:55:46
123456 31MAR18:23:55:46
123456 31MAR18:23:58:46
654321 31MAR18:23:44:18
654321 31MAR18:23:35:11
;
run;
/* Sort the data first */
PROC SORT DATA=have;
by IP DateTime;
run;

DATA WANT;
Format lag_Date DATETIME18.;
Retain Time;
set have;
by  IP DateTime;
lag_Date=LAG(DateTime); /* Previous DateTime to Compare with current DateTime */
If first.IP then Do;
		lag_Date=DateTime;
		Time=0;
		Count=1;
		end;
else Time+INTCK('SECONDS',lag_Date,DateTime); /* Number of seconds from previous Datetime to current Datetime */
If time<=60*60 and time^=0 then count+1;
If last.IP;
Keep IP Count;
run;

else, if you want to count every hourly then use HOUR() for hour and sort the data  by IP DATE and Hour and use FIRST. for these by variables for count.

data want2;
set have;
hour=hour(DateTime);
Date=datepart(Datetime);
run;

Proc sort data=want2;
by IP Date Hour;
run;
data final;
set want2;
by IP Date Hour;
if first.Hour then count=1;
else count+1;
if last.hour;
keep IP Hour Count;
run;
Thanks,
Suryakiran
Ask a Question
Discussion stats
  • 9 replies
  • 169 views
  • 0 likes
  • 4 in conversation