- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello everyone, hope y'all are doing good.
Over the last days i've been frustrated trying to make my research using SAS. You see, my research uses time variables on the hh:mm:ss format and FreelanceReinhard was been very helpfull teaching me how to correctly make SAS read the time valuables to put then on a table.
However, now comes the analytcs part, wich i've tried a lot of things. First lemme show you guys the code i'm using and a sample of the variables:
DATA analise2014; infile cards dlm='09'x dsd truncover; input nnecropsia (tobito telaboracaobo chegadasvoi inicionecropsia retiradafuneraria) (:time.); format tobito--retiradafuneraria time8.; CARDS; 2 15:20:00 20:12:00 22:30:00 22:32:00 00:46:00 3 16:45:00 20:01:00 22:32:00 23:10:00 00:46:00 4 22:16:00 02:02:00 03:47:00 08:16:00 10:18:00 RUN; proc print data=analise2014; run;
Now please understand that in only showing three samples, but my research has more than three thousant... so you know what i mean.
What i need of my variables are the following, lemme give you an exemple, i hope this will make it easier to understand:
tobito telaboracaobo chegadasvoi inicionecropsia retiradafuneraria
22:16:00 02:02:00 03:47:00 08:16:00 10:18:00
I I I I I
V V V V V
Let's call this This one "C" "D" "E"
variable "A" is "B"
I need to know how much time has passed from A to B, from B to C, C to D and finnaly D to E, so mathematycally i guess this comes down to (B - A), (C - B), (D - C) and (E - D), i need to do this for all the collums .
Now that i "have" the knowledge of how much time passed for each sample, i need the arithmetic average of these results meaning: all of the (B - A) / X, "X" beeing the number of results i got from (B - A), a simple average, and repeat it for the other variables.
However i have two major problems: I can't find a way to make SAS make these calculations with time. When i use the code:
proc univariate plot; var tobito; run;
just to see how the distribuction of time is in one variable, the mean dosen't come in hh:mm:ss format, so if i tried to make the calculations, SAS wouldn't read it as the time format. I've tried to understand INTCK but i don't know if it applies to what i want and i can't find an answer on the presentation by Andrew Karp, so i'm really lost on how to do it.
This is one of the problems, the other one you might already see if you look closely on the sample i got for my "calculations example": if you make B - A (02:02:00 - 22:16:00) the number will come as a negative number because, at least what i've read about in other posts of the forum, SAS has an issue when calculating times that "tresspass" midnight, so i'm also lost on how to resolve that.
If any of you can teach me how to make these calculations and fix these problems, it would immensely help.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
All the procs will produce the same underlying value - i.e. the number of seconds representing the various means.
I don't believe you can make proc univariate change the format of the printed results, even though its numeric value is correct. You can either tell univariate to output a dataset of the means and then print them using a time. format.
... or consider using, smoe other proc that allows control of formats of reported statistics. Using (say) proc tabulate, you could:
DATA analise2014;
infile cards truncover;
input nnecropsia (a b c d e ) (:time.);
format a--e time8.;
CARDS;
2 15:20:00 20:12:00 22:30:00 22:32:00 00:46:00
3 16:45:00 20:01:00 22:32:00 23:10:00 00:46:00
4 22:16:00 02:02:00 03:47:00 08:16:00 10:18:00
RUN;
data need;
set analise2014;
a_to_b=b-a;
b_to_c=c-b;
c_to_d=d-c;
d_to_e=e-d;
format a_to_b -- d_to_e time8. ;
run;
proc tabulate data=need noseps;
var a_to_b -- d_to_e;
tables a_to_b--d_to_e , mean*f=time10.0;
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
All the procs will produce the same underlying value - i.e. the number of seconds representing the various means.
I don't believe you can make proc univariate change the format of the printed results, even though its numeric value is correct. You can either tell univariate to output a dataset of the means and then print them using a time. format.
... or consider using, smoe other proc that allows control of formats of reported statistics. Using (say) proc tabulate, you could:
DATA analise2014;
infile cards truncover;
input nnecropsia (a b c d e ) (:time.);
format a--e time8.;
CARDS;
2 15:20:00 20:12:00 22:30:00 22:32:00 00:46:00
3 16:45:00 20:01:00 22:32:00 23:10:00 00:46:00
4 22:16:00 02:02:00 03:47:00 08:16:00 10:18:00
RUN;
data need;
set analise2014;
a_to_b=b-a;
b_to_c=c-b;
c_to_d=d-c;
d_to_e=e-d;
format a_to_b -- d_to_e time8. ;
run;
proc tabulate data=need noseps;
var a_to_b -- d_to_e;
tables a_to_b--d_to_e , mean*f=time10.0;
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello mkeintz! Your code worked beautifully, i just had to add the code i was using in the second line, so that SAS would understand that i was using a time format, so the final code was like this:
DATA analise2014; infile cards dlm='09'x dsd truncover; input nnecropsia (a b c d e ) (:time.); format a--e time8.; CARDS; 2 15:20:00 20:12:00 22:30:00 22:32:00 00:46:00 3 16:45:00 20:01:00 22:32:00 23:10:00 00:46:00 4 22:16:00 02:02:00 03:47:00 08:16:00 10:18:00 RUN; data need; set analise2014; a_to_b=b-a; b_to_c=c-b; c_to_d=d-c; d_to_e=e-d; format a_to_b -- d_to_e time8. ; run; proc tabulate data=need noseps; var a_to_b -- d_to_e; tables a_to_b--d_to_e , mean*f=time10.0; run;
However, when i run; the second part of the code (set analise2014) it falls on that problem that i mentioned:
when SAS makes B - A in the last sample (02:02:00 - 22:16:00) it comes as -20:14 (don't know why it is this result) instead of the right result that would be 03:46:00 (22:16:00 + 3:46:00 = 02:02:00). You already helped me a lot, but do you know how i can fix this SAS problem? If not, i'm already very greatfull.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why would expect that subtracting a large number from a smaller number should not result in a negative result?
Perhaps you want to treat the differences as if they were datetime values and just extract the time part again.
844 data test; 845 from='02:00't ; 846 to='22:00't ; 847 diff = from - to; 848 diff2 = timepart(diff); 849 850 put (from to diff diff2) (=) 851 / (from to diff diff2) (=time.) 852 / (from to diff diff2) (=datetime.) 853 ; 854 run; from=7200 to=79200 diff=-72000 diff2=14400 from=2:00:00 to=22:00:00 diff=-20:00 diff2=4:00:00 from=01JAN60:02:00:00 to=01JAN60:22:00:00 diff=31DEC59:04:00:00 diff2=01JAN60:04:00:00
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you so much!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If
- you want the amount of time from A to B
- Every instance in which time B is less than time A represents crossing a single midnight (i.e. B is the next day
then instead of
a_to_b=b-a;
use
a_to_b=ifn(b>a,b-a,'24:00:00't-a+b);
The ifn function (like the excel IF function) says to use b-a if b>a. Otherwise add b to the time from a to midnight (represented by the time literal '24:00:00't).
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you again for the help, you're all awesome!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So you have times that could come from different dates and you didn't collect the date?
Is it safe to assume that out of sequence times mean there was a change of day? What if it is just a data entry error?
Also is it possible that the interval is more than 24 hours? If so then the time parts might appear in order but actually mean a difference of more than 24 hours.
A time value is a time on 01JAN1960 when you treat it as a datetime value. So if you want treat TIME2 < TIME1 as meaning TIME2 occurred on the next day then treat it as a time on 02JAN1960 instead of 01JAN1960. Boolean expression return 1 when true and 0 when false. So use the test has the DAY value in a call to the DHMS() (days, hours, minutes, seconds) function.
time_diff = dhms(time2<time1,0,0,time2)-time1 ;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Really appreciate your concern, thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Note that proc summary output statistics inherit the formats of the input data variables. Most procedures do not have that feature.