The SAS code I am using is the following:
PROC SUMMARY DATA=INTPS missing;
CLASS HH;
VAR HH;
OUTPUT OUT=TPS
MEAN=AVGTPS
MAX=PEAKTPS
/NOINHERIT;
Please supply an example for your data in usable form (data step with datalines, posted into a code box opened with the "little running man").
Since you only use one variable (which is suspicious), we need to know what is contained in HH and its attributes.
HH is hours - the input dataset contains 1 entry per transaction processed - MM is minutes, SS is seconds. This input dataset contains over 100 million records. What I want to calculate is transaction arrival rate by hour.
Define "arrival rate".
Transactions per second can be calculated like this:
data intps;
input hh mm ss;
datalines;
3 59 59.807316959
3 59 59.813843966
3 59 59.823140979
4 0 0.123
4 0 0.345
;
data transactions;
set intps end = done;
retain hour min sec count;
sec = floor(ss);
if hour ne hh or min ne mm or sec ne floor(ss)
then do;
if count > 0 then output;
count = 0;
hour = hh;
min = mm;
sec = floor(ss);
end;
count + 1;
if done then output;
keep hour min sec count;
run;
proc summary data=transactions;
by hour;
var count;
output out=want (drop=_type_ _freq_) mean()=avgtps max()=peaktps;
run;
Note that I use the rather complicated data step to avoid using SQL or an extra step that calculates the second without the fractions. You could replace it by
proc sql;
create table transactions as
select
hh,
mm,
floor(ss) as ss,
count(*) as count
from intps
group by hh, mm, calculated ss
;
quit;
and then use hh as BY variable in the SUMMARY. Which method you use should be determined by test runs with your production data.
You may also contemplate some method to insert "0" observations for missing seconds, so that you always have 3600 observations per hour.
I ran the code, and AVGTPS and PEAKTPS don't look right - They show the same values that I got when I ran my own version:
Hour Trancount Average TPS Peak TPS
3 188 3 3
4 2191715 4 4
5 2433846 5 5
....
21 3741828 21 21
Doing the math would show that for hour 21, the average TPS would be 1039
(3,741,828 / 3600 (60 sec * 60 minutes)).
How did you calculate trancount?
With the data you supplied, my code produces correct results.
So you first need to count the number of observations per second, and then calculate the max and mean per hour?
Once again I ask for example data. I will not ask a third time.
Sample data:
3 59 59.807316959
3 59 59.813843966
3 59 59.823140979
...
23 59 59.904180944
My sample dataset 9865385 records - a full dataset will contain over 100 million records.
3 59 59.807316959
3 59 59.813843966
3 59 59.823140979
4 0 0.123
4 0 0.345
The first column is the hour, the second column is the minute, the last column is the second with the subseconds following the period. The dataset has over 100 million observations.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.