chris_anon1
Calcite | Level 5

Hey everyone, I'm pretty new to SAS, and am having trouble calculating the mean time for a dataset.

My data is in military time, but the problem is that the central peak of the distribution is centered at midnight. So, instead of having a single peak at midnight, SAS gives me two peaks: one around 11:59 PM and another around 12:01 AM (or, as my dataset has, one peak at 2359 and another 0001).

I was wondering how I might code SAS to realize that time is continuous through midnight.

By the way, my code so far looks like this:

proc univariate data=timesr;
  var hour;
  histogram;
run;

Thank you in advance!

9 REPLIES
jakarman
Barite | Level 11

Basically, SAS stores time as a continuous floating-point number, either in seconds or in days, with the zero point at 1 Jan 1960 (make that Zulu time once NLS comes into play). See SAS(R) 9.4 Language Reference: Concepts, Fourth Edition. Whatever you have done with the variable hour has broken that concept.

Another SAS basic is that you do not need to recode the actual values; you can apply a format instead. See SAS(R) 9.4 Formats and Informats: Reference. Limiting the displayed length to the hour, or defining your own format, should solve your question.

You have not given much information about the data you are processing. Please share more.

---->-- ja karman --<-----
Reeza
Super User

Does your data have a date component?

If so, add a date to your time variable and then format it to show only the time. 

Another method would be to centre the data around midnight and then draw the histogram.
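
A minimal sketch of the centering idea, assuming the times are stored as numeric HHMM values (for example 2359 and 1 for 0001), as the question describes. The dataset name sample_times and the variables hhmm, minutes and minutes_centered are hypothetical names chosen for illustration. Times from noon onward are shifted to negative minutes, so 2359 becomes -1 and sits right next to 0001:

```sas
/* Hypothetical sample data: times stored as numeric HHMM */
data sample_times;
  input hhmm;
  datalines;
2350
2359
0001
0010
;
run;

data centered;
  set sample_times;
  /* Minutes after midnight, 0 to 1439 */
  minutes = 60*int(hhmm/100) + mod(hhmm, 100);
  /* Move noon-to-midnight times below zero so midnight sits at 0 */
  if minutes >= 720 then minutes_centered = minutes - 1440;
  else minutes_centered = minutes;
run;

proc univariate data=centered;
  var minutes_centered;
  histogram minutes_centered;
run;
```

This only works well when the data really cluster around midnight. If a second cluster sits near noon, the split point at 720 minutes would cut through it.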

chris_anon1
Calcite | Level 5

I've added the corresponding dates, but I'm not sure how to format it to show only hour. Any help you can offer with this?

gergely_batho
SAS Employee

format t tod2.;

chris_anon1
Calcite | Level 5

I hate to say this, but every time I try this format, my data table is simply replaced with one observation that reads ".".

I've converted all my times to the standard SAS serial code for time (i.e. seconds after midnight), but to no avail.

gergely_batho
SAS Employee

Are you sure you have a proper SAS datetime value in the variable?

What happens if you put the datetime20. format on it? Does it work?

data _null_;
  dt=datetime();
  put dt:datetime20.;
  put dt:tod2.;
run;

What is your plan for determining the "mean time"? (I think what you are after is rather the peak hour?)

Some SAS procedures will honor the format you apply (typically, when you treat time/datetime as a category variable), and some procedures won't use the format (those that treat time/datetime as continuous).

You can also create bins without using formats.

First of all, you can divide time by 60 or 3600 etc. and round the results. Or you can round to the nearest hour with the round function. Or you can use the intnx() function.

Some examples:

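data time;
  set time;
  time_bin=intnx('hour',t,0,'b');   /* start of the hour containing t */
  time_bin2=round(t,3600);          /* nearest hour (3600 seconds) */
  time_bin3=round(t,'1:00't);       /* same, using a one-hour time literal */
  time_bin4=intnx('hour',t,0,'m');  /* middle of the hour containing t */
run;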

gergely_batho
SAS Employee

If you want to illustrate on a chart that time is continuous through midnight, then this is a (rare) case where a radar chart is appropriate:

data time;
  t=1;
  do i=1 to 10000;
    t+1000*ranuni(123);
    output;
  end;
  format t tod2.;
run;

proc gradar data=time;
  chart t;
run;

If you want to identify "peak hours", "peak minutes" or something similar, then you are looking for the mode of the distribution with a "cyclical twist".

I would follow what Reeza suggests: re-center the data.

data time2;
  set time;
  timePartOrig  =timepart(t);
  timePartReCent=timepart(t+'12:00't);
run;

proc univariate data=time2 round=60 modes;
  var timePartOrig timePartReCent;
  histogram;
run;

You can see in this example that I am using rounding (otherwise there is no mode, because in my generated data time is continuous and no value occurs twice). If your data is similar, other procedures might be more useful for extracting "peaks" (e.g. proc kde).
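
If an actual mean time is wanted rather than a mode, one option that is not proposed in this thread is a circular mean: map each time to an angle on a 24-hour circle, average the sines and cosines, and convert the resulting angle back to a time. The sketch below assumes numeric HHMM input; the dataset and variable names (sample_times, hhmm, angles, avg, circ_mean) are hypothetical.

```sas
/* Hypothetical sample data: times stored as numeric HHMM */
data sample_times;
  input hhmm;
  datalines;
2350
2359
0001
0010
;
run;

/* Map each time to an angle on a 24-hour circle */
data angles;
  set sample_times;
  minutes = 60*int(hhmm/100) + mod(hhmm, 100);
  theta = 2*constant('pi')*minutes/1440;
  s = sin(theta);
  c = cos(theta);
run;

/* Average the sine and cosine components */
proc means data=angles noprint;
  var s c;
  output out=avg mean=mean_s mean_c;
run;

/* Convert the mean direction back to a SAS time value */
data circ_mean;
  set avg;
  theta_bar = atan2(mean_s, mean_c);
  if theta_bar < 0 then theta_bar = theta_bar + 2*constant('pi');
  /* Seconds after midnight; mod() folds a rounded 24:00:00 back to 0 */
  mean_time = mod(round(86400*theta_bar/(2*constant('pi')), 1), 86400);
  format mean_time time8.;
run;

proc print data=circ_mean;
  var mean_time;
run;
```

The four sample values are symmetric around midnight, so the result should land at or very close to midnight, whereas the ordinary arithmetic mean of the same values would fall near noon. When mean_s and mean_c are both close to zero, the times are spread around the whole day and the circular mean is not meaningful.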

ballardw
Super User

Do you actually get midnight values? Either 2400 or 0000? If not then there may be something fishy in collection.

Ksharp
Super User

Using a kernel density estimate might smooth it:

Use proc kde to get the kernel density.

After that, use proc sgplot to draw the histogram.
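
A rough sketch of those two steps, assuming SAS/STAT is available for proc kde. The data here are simulated, and the names kde_demo, m and kde_out are hypothetical; m stands for minutes relative to midnight after re-centering. The estimated density is drawn as a curve from the output dataset rather than as a histogram:

```sas
/* Hypothetical simulated data: minutes relative to midnight */
data kde_demo;
  call streaminit(123);
  do i = 1 to 500;
    m = rand('normal', 0, 45);
    output;
  end;
run;

/* Estimate the density; OUT= holds the grid points and density values */
proc kde data=kde_demo;
  univar m / out=kde_out;
run;

/* Plot the estimated density curve */
proc sgplot data=kde_out;
  series x=value y=density;
run;
```

The peak of the curve gives a smoothed estimate of the "peak time". If only a picture is needed, proc sgplot can also overlay a kernel curve on a histogram directly with its histogram statement and density statement (type=kernel).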

Xia Keshan
