BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NevermoreRedres
Obsidian | Level 7

Hello everyone, hope y'all are doing good.

Over the last days i've been frustrated trying to make my research using SAS. You see, my research uses time variables on the hh:mm:ss format and FreelanceReinhard was been very helpfull teaching me how to correctly make SAS read the time valuables to put then on a table.

However, now comes the analytcs part, wich i've tried a lot of things. First lemme show you guys the code i'm using and a sample of the variables:

 

DATA analise2014;
infile cards dlm='09'x dsd truncover;
input nnecropsia (tobito telaboracaobo chegadasvoi inicionecropsia retiradafuneraria) (:time.);
format tobito--retiradafuneraria time8.;

CARDS;
2   	15:20:00	20:12:00	22:30:00	22:32:00	00:46:00
3	16:45:00	20:01:00	22:32:00	23:10:00	00:46:00
4	22:16:00	02:02:00	03:47:00	08:16:00	10:18:00
RUN;

proc print data=analise2014; run;

 

Now please understand that in only showing three samples, but my research has more than three thousant... so you know what i mean.

What i need of my variables are the following, lemme give you an exemple, i hope this will make it easier to understand:

 

    tobito            telaboracaobo          chegadasvoi        inicionecropsia     retiradafuneraria

 22:16:00             02:02:00                   03:47:00                08:16:00             10:18:00

        I                          I                                I                            I                           I

       V                         V                              V                           V                          V

Let's call this          This one                     "C"                         "D"                      "E"

variable "A"            is "B"

 

I need to know how much time has passed from A to B, from B to C, C to D and finnaly D to E, so mathematycally i guess this comes down to (B - A), (C - B), (D - C) and (E - D), i need to do this for all the collums .

Now that i "have" the knowledge of how much time passed for each sample, i need the arithmetic average of these results meaning: all of the (B - A) / X, "X" beeing the number of results i got from (B - A), a simple average, and repeat it for the other variables.

However i have two major problems: I can't find a way to make SAS make these calculations with time. When i use the code:

proc univariate plot; var tobito;
run; 

just to see how the distribuction of time is in one variable, the mean dosen't come in hh:mm:ss format, so if i tried to make the calculations, SAS wouldn't read it as the time format. I've tried to understand INTCK but i don't know if it applies to what i want and i can't find an answer on the presentation by Andrew Karp, so i'm really lost on how to do it.

This is one of the problems, the other one you might already see if you look closely on the sample i got for my "calculations example": if you make B - A (02:02:00 - 22:16:00) the number will come as a negative number because, at least what i've read about in other posts of the forum, SAS has an issue when calculating times that "tresspass" midnight, so i'm also lost on how to resolve that.

 

If any of you can teach me how to make these calculations and fix these problems, it would immensely help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

All the procs will produce the same underlying value - i.e. the number of seconds representing the various means. 

 

I don't believe you can make proc univariate change the format of the printed results, even though its numeric value is correct.  You can either tell univariate to output a dataset of the means and then print them using a time. format.

 

... or consider using, smoe other proc that allows control of formats of reported statistics.  Using (say) proc tabulate, you could:

 

DATA analise2014;
infile cards truncover;
input nnecropsia  (a b c d e ) (:time.);
format a--e time8.;
CARDS;
2   15:20:00	20:12:00	22:30:00	22:32:00	00:46:00
3	16:45:00	20:01:00	22:32:00	23:10:00	00:46:00
4	22:16:00	02:02:00	03:47:00	08:16:00	10:18:00
RUN;

data need;
  set analise2014;
  a_to_b=b-a;  
  b_to_c=c-b;
  c_to_d=d-c;
  d_to_e=e-d;
  format a_to_b -- d_to_e time8. ;
run;
proc tabulate data=need noseps;
  var a_to_b -- d_to_e;
  tables  a_to_b--d_to_e , mean*f=time10.0;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

9 REPLIES 9
mkeintz
PROC Star

All the procs will produce the same underlying value - i.e. the number of seconds representing the various means. 

 

I don't believe you can make proc univariate change the format of the printed results, even though its numeric value is correct.  You can either tell univariate to output a dataset of the means and then print them using a time. format.

 

... or consider using, smoe other proc that allows control of formats of reported statistics.  Using (say) proc tabulate, you could:

 

DATA analise2014;
infile cards truncover;
input nnecropsia  (a b c d e ) (:time.);
format a--e time8.;
CARDS;
2   15:20:00	20:12:00	22:30:00	22:32:00	00:46:00
3	16:45:00	20:01:00	22:32:00	23:10:00	00:46:00
4	22:16:00	02:02:00	03:47:00	08:16:00	10:18:00
RUN;

data need;
  set analise2014;
  a_to_b=b-a;  
  b_to_c=c-b;
  c_to_d=d-c;
  d_to_e=e-d;
  format a_to_b -- d_to_e time8. ;
run;
proc tabulate data=need noseps;
  var a_to_b -- d_to_e;
  tables  a_to_b--d_to_e , mean*f=time10.0;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
NevermoreRedres
Obsidian | Level 7

Hello mkeintz! Your code worked beautifully, i just had to add the code i was using in the second line, so that SAS would understand that i was using a time format, so the final code was like this:

 

DATA analise2014;
infile cards dlm='09'x dsd truncover;
input nnecropsia  (a b c d e ) (:time.);
format a--e time8.;
CARDS;
  2   	15:20:00	20:12:00	22:30:00	22:32:00	00:46:00
 3	 16:45:00	20:01:00	22:32:00	23:10:00	00:46:00
 4	 22:16:00	02:02:00	03:47:00	08:16:00	10:18:00
RUN;

data need;
  set analise2014;
  a_to_b=b-a;  
  b_to_c=c-b;
  c_to_d=d-c;
  d_to_e=e-d;
  format a_to_b -- d_to_e time8. ;
run;
proc tabulate data=need noseps;
  var a_to_b -- d_to_e;
  tables  a_to_b--d_to_e , mean*f=time10.0;
run;

However, when i run; the second part of the code (set analise2014) it falls on that problem that i mentioned:

when SAS makes B - A in the last sample (02:02:00 - 22:16:00) it comes as -20:14 (don't know why it is this result) instead of the right result that would be 03:46:00 (22:16:00 + 3:46:00 = 02:02:00). You already helped me a lot, but do you know how i can fix this SAS problem? If not, i'm already very greatfull.

 

 

Tom
Super User Tom
Super User

Why would expect that subtracting a large number from a smaller number should not result in a negative result?

Perhaps you want to treat the differences as if they were datetime values and just extract the time part again.

844   data test;
845     from='02:00't ;
846     to='22:00't ;
847     diff = from - to;
848     diff2 = timepart(diff);
849
850     put (from to diff diff2) (=)
851       / (from to diff diff2) (=time.)
852       / (from to diff diff2) (=datetime.)
853     ;
854   run;

from=7200 to=79200 diff=-72000 diff2=14400
from=2:00:00 to=22:00:00 diff=-20:00 diff2=4:00:00
from=01JAN60:02:00:00 to=01JAN60:22:00:00 diff=31DEC59:04:00:00 diff2=01JAN60:04:00:00
NevermoreRedres
Obsidian | Level 7
Hello Tom! Yes, i know that if subtracting a large number from a smaller number results in a negative number. The answer to my question is exacly what you said, "treat the differences as if they were datetime values and just extract the time part again". Now i just have to find a way to to make this for all the samples.
Thank you so much!
mkeintz
PROC Star

If

  1. you want the amount of time from A to B
  2. Every instance in which time B is less than time A represents crossing a single midnight (i.e. B is the next day

then instead of 

a_to_b=b-a;

use

a_to_b=ifn(b>a,b-a,'24:00:00't-a+b);

The ifn function (like the excel IF function) says to use b-a if b>a.  Otherwise add b to the time from a to midnight (represented by the time literal '24:00:00't).  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
NevermoreRedres
Obsidian | Level 7
Helo again mkeintz! That's exacly what i needed! Now the subtractions come as the correct result!
Thank you again for the help, you're all awesome!
Tom
Super User Tom
Super User

So you have times that could come from different dates and you didn't collect the date? 

Is it safe to assume that out of sequence times mean there was a change of day?  What if it is just a data entry error?

Also is it possible that the interval is more than 24 hours?  If so then the time parts might appear in order but actually mean a difference of more than 24 hours.

 

A time value is a time on 01JAN1960 when you treat it as a datetime value.  So if you want treat TIME2 < TIME1 as meaning TIME2 occurred on the next day then treat it as a time on 02JAN1960 instead of 01JAN1960.  Boolean expression return 1 when true and 0 when false.  So use the test has the DAY value in a call to the DHMS() (days, hours, minutes, seconds) function.

time_diff = dhms(time2<time1,0,0,time2)-time1 ;

 

NevermoreRedres
Obsidian | Level 7
Hello again Tom! All the dates are in sequence, for example If A is 22:16:00 01/02 and B is 02:02:00, i can safely say that the date B is in is 02/02, at least for my research, so there are no intervals with more than 24h.
Really appreciate your concern, thank you!
PGStats
Opal | Level 21

Note that proc summary output statistics inherit the formats of the input data variables. Most procedures do not have that feature.

PG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1133 views
  • 4 likes
  • 4 in conversation