Hello,
I need a help urgently with a large dataset that I am working on, I will like to be able to count number of events per minute, see below for an hypothetical data set and the result I am looking to get.
data temp;
input obs datetime : datetime id ;
format datetime datetime.;
datalines;
1 6/8/2011:00:00:30 1
2 6/8/2011:00:00:15 1
3 6/8/2011:00:01:00 1
4 6/8/2011:00:01:50 1
5 6/8/2011:00:01:10 1
6 6/8/2011:00:01:30 1
7 6/8/2011:00:02:00 1
8 6/8/2011:00:02:44 1
9 6/8/2011:00:00:17 1
10 6/8/2011:00:02:10 1
11 6/8/2011:00:03:21 1
12 6/8/2011:00:04:00 1
13 6/8/2011:00:03:32 1
;
run;
To count the number of activity within each minute, which means for a whole day I am expecting to have 1,440 rows of data
e.g
less or equal to 6/8/2011:00:01:00 4
less or equal to 6/8/2011:00:02:00 greater than 6/8/2011:00:01:00 4
less or equal to 6/8/2011:00:03:00 greater than 6/8/2011:00:02:00 2
less or equal to 6/8/2011:00:04:00 greater than 6/8/2011:00:03:00 3
Any help will be most appreciated.
Adeboye
How about:
options datestyle=mdy; data temp; input obs datetime : anydtdtm20. id ; minute=minute(datetime); format datetime datetime.; datalines; 1 6/8/2011:00:00:30 1 2 6/8/2011:00:00:15 1 3 6/8/2011:00:01:00 1 4 6/8/2011:00:01:50 1 5 6/8/2011:00:01:10 1 6 6/8/2011:00:01:30 1 7 6/8/2011:00:02:00 1 8 6/8/2011:00:02:44 1 9 6/8/2011:00:00:17 1 10 6/8/2011:00:02:10 1 11 6/8/2011:00:03:21 1 12 6/8/2011:00:04:00 1 ; run; proc sort data=temp; by minute; run; data want; set temp; by minute; retain count; if first.minute then count=0; count+1; if last.minute then output; run;
Ksharp
I would make the intervals go from 0-59 seconds, and use the INTNX function to create the minute grouping variable.
data temp;
input obs datetime : mdyampm. id;
minute = intnx('dtminute1',datetime,0,'b');
format datetime minute datetime.;
datalines;
0 6/8/2011:00:00:00 1
1 6/8/2011:00:00:30 1
2 6/8/2011:00:00:59 1
3 6/8/2011:00:01:00 1
4 6/8/2011:00:01:50 1
5 6/8/2011:00:01:10 1
6 6/8/2011:00:01:30 1
7 6/8/2011:00:02:00 1
8 6/8/2011:00:02:44 1
9 6/8/2011:00:00:17 1
10 6/8/2011:00:02:10 1
11 6/8/2011:00:03:21 1
12 6/8/2011:00:04:00 1
13 6/8/2011:00:03:32 1
;;;;
run;
proc print;
run;
Obs obs datetime id minute
1 0 08JUN11:00:00:00 1 08JUN11:00:00:00
2 1 08JUN11:00:00:30 1 08JUN11:00:00:00
3 2 08JUN11:00:00:59 1 08JUN11:00:00:00
4 3 08JUN11:00:01:00 1 08JUN11:00:01:00
5 4 08JUN11:00:01:50 1 08JUN11:00:01:00
6 5 08JUN11:00:01:10 1 08JUN11:00:01:00
7 6 08JUN11:00:01:30 1 08JUN11:00:01:00
8 7 08JUN11:00:02:00 1 08JUN11:00:02:00
9 8 08JUN11:00:02:44 1 08JUN11:00:02:00
10 9 08JUN11:00:00:17 1 08JUN11:00:00:00
11 10 08JUN11:00:02:10 1 08JUN11:00:02:00
12 11 08JUN11:00:03:21 1 08JUN11:00:03:00
13 12 08JUN11:00:04:00 1 08JUN11:00:04:00
14 13 08JUN11:00:03:32 1 08JUN11:00:03:00
Thanks for this, it looks a lot neater. However, I will like to know if it is possible to group your output by 'minute'
so i can have something like this
minute count
08JUN11:00:00:00 4
08JUN11:00:01:00 4
08JUN11:00:02:00 3
08JUN11:00:03:00 2
08JUN11:00:04:00 1
Thanks for your help
Two options:
1) Use data step logic using lag() and increment counter yourself; or
2) Replace proc print with proc means should give you the summary:
proc means data=<whatever> missing nway;
class minute;
output out=summary(drop=_type_ rename=(_freq_=count));
run;
by default you'll get a listing. use the "noprint" option to suppress listing on proc means.
A dataset called "summary" will be created by the output statement. This is probably what you want for further analysis.
I though you would know that. You can also create the grouping without creating a new variable using datetime14. format.
data temp;
input obs datetime : mdyampm. id;
minute = intnx('dtminute1',datetime,0,'b');
format datetime minute datetime.;
datalines;
0 6/8/2011:00:00:00 1
1 6/8/2011:00:00:30 1
2 6/8/2011:00:00:59 1
3 6/8/2011:00:01:00 1
4 6/8/2011:00:01:50 1
5 6/8/2011:00:01:10 1
6 6/8/2011:00:01:30 1
7 6/8/2011:00:02:00 1
8 6/8/2011:00:02:44 1
9 6/8/2011:00:00:17 1
10 6/8/2011:00:02:10 1
11 6/8/2011:00:03:21 1
12 6/8/2011:00:04:00 1
13 6/8/2011:00:03:32 1
;;;;
run;
proc print;
run;
proc summary data=temp nway;
class id datetime;
format datetime datetime14.;
output out=sum1;
run;
proc summary data=temp nway;
class id minute;
output out=sum2;
run;
proc compare base=sum1 compare=sum2;
run;
Hi,
I tried couple of solutions given here, but I didn't get proper output. The below one is a tested solution.
DATA EVENT(KEEP=OBS ID XDATE MINUTE);
INPUT OBS XTEXT :$19. ID;
XDATE=INPUT(XTEXT,ANYDTDTM30.);
MINUTE=INTNX('MINUTE',XDATE,0);
FORMAT XDATE MINUTE DATETIME.;
DATALINES;
...
...
;
RUN;
PROC SORT DATA = EVENT;
BY MINUTE;
RUN;
PROC FREQ DATA = EVENT;
TABLES MINUTE;
RUN;
Hope this helps...!
Thanks
Dhanasekaran R
How about:
options datestyle=mdy; data temp; input obs datetime : anydtdtm20. id ; minute=minute(datetime); format datetime datetime.; datalines; 1 6/8/2011:00:00:30 1 2 6/8/2011:00:00:15 1 3 6/8/2011:00:01:00 1 4 6/8/2011:00:01:50 1 5 6/8/2011:00:01:10 1 6 6/8/2011:00:01:30 1 7 6/8/2011:00:02:00 1 8 6/8/2011:00:02:44 1 9 6/8/2011:00:00:17 1 10 6/8/2011:00:02:10 1 11 6/8/2011:00:03:21 1 12 6/8/2011:00:04:00 1 ; run; proc sort data=temp; by minute; run; data want; set temp; by minute; retain count; if first.minute then count=0; count+1; if last.minute then output; run;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.