Hello,
I need a help urgently with a large dataset that I am working on, I will like to be able to count number of events per minute, see below for an hypothetical data set and the result I am looking to get.
data temp;
input obs datetime : datetime id ;
format datetime datetime.;
datalines;
1 6/8/2011:00:00:30 1
2 6/8/2011:00:00:15 1
3 6/8/2011:00:01:00 1
4 6/8/2011:00:01:50 1
5 6/8/2011:00:01:10 1
6 6/8/2011:00:01:30 1
7 6/8/2011:00:02:00 1
8 6/8/2011:00:02:44 1
9 6/8/2011:00:00:17 1
10 6/8/2011:00:02:10 1
11 6/8/2011:00:03:21 1
12 6/8/2011:00:04:00 1
13 6/8/2011:00:03:32 1
;
run;
To count the number of activity within each minute, which means for a whole day I am expecting to have 1,440 rows of data
e.g
less or equal to 6/8/2011:00:01:00 4
less or equal to 6/8/2011:00:02:00 greater than 6/8/2011:00:01:00 4
less or equal to 6/8/2011:00:03:00 greater than 6/8/2011:00:02:00 2
less or equal to 6/8/2011:00:04:00 greater than 6/8/2011:00:03:00 3
Any help will be most appreciated.
Adeboye
How about:
options datestyle=mdy; data temp; input obs datetime : anydtdtm20. id ; minute=minute(datetime); format datetime datetime.; datalines; 1 6/8/2011:00:00:30 1 2 6/8/2011:00:00:15 1 3 6/8/2011:00:01:00 1 4 6/8/2011:00:01:50 1 5 6/8/2011:00:01:10 1 6 6/8/2011:00:01:30 1 7 6/8/2011:00:02:00 1 8 6/8/2011:00:02:44 1 9 6/8/2011:00:00:17 1 10 6/8/2011:00:02:10 1 11 6/8/2011:00:03:21 1 12 6/8/2011:00:04:00 1 ; run; proc sort data=temp; by minute; run; data want; set temp; by minute; retain count; if first.minute then count=0; count+1; if last.minute then output; run;
Ksharp
I would make the intervals go from 0-59 seconds, and use the INTNX function to create the minute grouping variable.
data temp;
input obs datetime : mdyampm. id;
minute = intnx('dtminute1',datetime,0,'b');
format datetime minute datetime.;
datalines;
0 6/8/2011:00:00:00 1
1 6/8/2011:00:00:30 1
2 6/8/2011:00:00:59 1
3 6/8/2011:00:01:00 1
4 6/8/2011:00:01:50 1
5 6/8/2011:00:01:10 1
6 6/8/2011:00:01:30 1
7 6/8/2011:00:02:00 1
8 6/8/2011:00:02:44 1
9 6/8/2011:00:00:17 1
10 6/8/2011:00:02:10 1
11 6/8/2011:00:03:21 1
12 6/8/2011:00:04:00 1
13 6/8/2011:00:03:32 1
;;;;
run;
proc print;
run;
Obs obs datetime id minute
1 0 08JUN11:00:00:00 1 08JUN11:00:00:00
2 1 08JUN11:00:00:30 1 08JUN11:00:00:00
3 2 08JUN11:00:00:59 1 08JUN11:00:00:00
4 3 08JUN11:00:01:00 1 08JUN11:00:01:00
5 4 08JUN11:00:01:50 1 08JUN11:00:01:00
6 5 08JUN11:00:01:10 1 08JUN11:00:01:00
7 6 08JUN11:00:01:30 1 08JUN11:00:01:00
8 7 08JUN11:00:02:00 1 08JUN11:00:02:00
9 8 08JUN11:00:02:44 1 08JUN11:00:02:00
10 9 08JUN11:00:00:17 1 08JUN11:00:00:00
11 10 08JUN11:00:02:10 1 08JUN11:00:02:00
12 11 08JUN11:00:03:21 1 08JUN11:00:03:00
13 12 08JUN11:00:04:00 1 08JUN11:00:04:00
14 13 08JUN11:00:03:32 1 08JUN11:00:03:00
Thanks for this, it looks a lot neater. However, I will like to know if it is possible to group your output by 'minute'
so i can have something like this
minute count
08JUN11:00:00:00 4
08JUN11:00:01:00 4
08JUN11:00:02:00 3
08JUN11:00:03:00 2
08JUN11:00:04:00 1
Thanks for your help
Two options:
1) Use data step logic using lag() and increment counter yourself; or
2) Replace proc print with proc means should give you the summary:
proc means data=<whatever> missing nway;
class minute;
output out=summary(drop=_type_ rename=(_freq_=count));
run;
by default you'll get a listing. use the "noprint" option to suppress listing on proc means.
A dataset called "summary" will be created by the output statement. This is probably what you want for further analysis.
I though you would know that. You can also create the grouping without creating a new variable using datetime14. format.
data temp;
input obs datetime : mdyampm. id;
minute = intnx('dtminute1',datetime,0,'b');
format datetime minute datetime.;
datalines;
0 6/8/2011:00:00:00 1
1 6/8/2011:00:00:30 1
2 6/8/2011:00:00:59 1
3 6/8/2011:00:01:00 1
4 6/8/2011:00:01:50 1
5 6/8/2011:00:01:10 1
6 6/8/2011:00:01:30 1
7 6/8/2011:00:02:00 1
8 6/8/2011:00:02:44 1
9 6/8/2011:00:00:17 1
10 6/8/2011:00:02:10 1
11 6/8/2011:00:03:21 1
12 6/8/2011:00:04:00 1
13 6/8/2011:00:03:32 1
;;;;
run;
proc print;
run;
proc summary data=temp nway;
class id datetime;
format datetime datetime14.;
output out=sum1;
run;
proc summary data=temp nway;
class id minute;
output out=sum2;
run;
proc compare base=sum1 compare=sum2;
run;
Hi,
I tried couple of solutions given here, but I didn't get proper output. The below one is a tested solution.
DATA EVENT(KEEP=OBS ID XDATE MINUTE);
INPUT OBS XTEXT :$19. ID;
XDATE=INPUT(XTEXT,ANYDTDTM30.);
MINUTE=INTNX('MINUTE',XDATE,0);
FORMAT XDATE MINUTE DATETIME.;
DATALINES;
...
...
;
RUN;
PROC SORT DATA = EVENT;
BY MINUTE;
RUN;
PROC FREQ DATA = EVENT;
TABLES MINUTE;
RUN;
Hope this helps...!
Thanks
Dhanasekaran R
How about:
options datestyle=mdy; data temp; input obs datetime : anydtdtm20. id ; minute=minute(datetime); format datetime datetime.; datalines; 1 6/8/2011:00:00:30 1 2 6/8/2011:00:00:15 1 3 6/8/2011:00:01:00 1 4 6/8/2011:00:01:50 1 5 6/8/2011:00:01:10 1 6 6/8/2011:00:01:30 1 7 6/8/2011:00:02:00 1 8 6/8/2011:00:02:44 1 9 6/8/2011:00:00:17 1 10 6/8/2011:00:02:10 1 11 6/8/2011:00:03:21 1 12 6/8/2011:00:04:00 1 ; run; proc sort data=temp; by minute; run; data want; set temp; by minute; retain count; if first.minute then count=0; count+1; if last.minute then output; run;
Ksharp
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.