Trying to calculate the total number of days between cases of "Debridement" by study number.
Data Have:
Studynum Proc1 Procedure Date
4 Debridement 09SEP13:15:34:00
4 Debridement 12SEP13:14:45:00
4 Debridement 20SEP13:20:22:00
5 Debridement 25OCT14:13:01:00
5 Debridement 26OCT14:15:01:00
Data Want:
Studynum Total Days Between Cases of Debridement
4 5.3
5 2.1
hi @stancemcgraw Are you after this?
data have;
input Studynum Proc1 : $20. Procedure_Date :datetime20.;
format Procedure_Date datetime20.;
cards;
4 Debridement 09SEP13:15:34:00
4 DI 11SEP13:13:20:00
4 Debridement 12SEP13:14:45:00
4 Debridement 20SEP13:20:22:00
5 DI 23OCT14:23:22:00
5 Debridement 25OCT14:13:01:00
;
data want;
set have;
by Studynum;
k=datepart(Procedure_Date);
diffindays=dif(k);
if first.studynum then diffindays=0;
drop k;
run;
@stancemcgraw wrote:
Trying to calculate the mean number of days between cases of "Debridement" or "DI" (Debridement and Irrigation) by study number.
Data Have:
Studynum Proc1 Procedure Date
4 Debridement 09SEP13:15:34:00
4 DI 11SEP13:13:20:00
4 Debridement 12SEP13:14:45:00
4 Debridement 20SEP13:20:22:00
5 DI 23OCT14:23:22:00
5 Debridement 25OCT14:13:01:00
Data Want:
Studynum Mean Days Between Cases of Debridement
4 5.3
5 2.1
And how would you get 5.3 for the first? is that by computing the absolute difference of the the single DI datetime to all of the Debridement dates and then averaging?
Do any of your studynum values have more than one DI datetime? If so describe how to calculate the individual differences that would be used for a mean?
DI and Debridement will be counted as the same thing. So yes. I just put 5.3 there is an example. I don't know the actual number
Please explain the logic/math/algorithm used to determine 5.3 for studynum 4 and 2.1 for studynum 5 (neither of which are obvious to me, even in the simpler case of studynum 5)
But we don't know how we would perform this calculation for studynum 4. You have to explain how it is done.
For studynum 5, it would appear the answer is 1 day, 13 hours, 39 minutes, or the decimal equivalent. Is that correct?
I would like to create SAS code that calculates the total days between cases of debridement by studynum. I don't know how to do that.
And as @ballardw and I have stated, we don't know what that means in the case of studynum 4. We can't tell you how to do this. We are asking for an explanation of what the calculations are, not the SAS code needed. Once we understand what the calculations are, we can determine the SAS code.
I think you need to talk to some subject matter expert at your company/university who can explain how to compute the average in the case of studynum 4.
I would also like you to state that the answer I gave for studynum 5 is correct (or not).
hi @stancemcgraw Are you after this?
data have;
input Studynum Proc1 : $20. Procedure_Date :datetime20.;
format Procedure_Date datetime20.;
cards;
4 Debridement 09SEP13:15:34:00
4 DI 11SEP13:13:20:00
4 Debridement 12SEP13:14:45:00
4 Debridement 20SEP13:20:22:00
5 DI 23OCT14:23:22:00
5 Debridement 25OCT14:13:01:00
;
data want;
set have;
by Studynum;
k=datepart(Procedure_Date);
diffindays=dif(k);
if first.studynum then diffindays=0;
drop k;
run;
Similar approach to @novinosrin but maintaining the seconds for the calculation. @stancemcgraw did not provide any rules about calculating anything. So I am getting the successive difference of seconds and averaging that. I leave the (should be trivial) exercise of turning the number of seconds into "days".
Issues in defining the calculations that should be addressed:
1) which values are to be compared. One "date" with the previous only or is some date compared with multiples?
2) is the data sorted in ascending date within the studynum? If not how should negative differences be treated? Or should the data be sorted to prevent that.
3) Are the differences supposed to be based on actual time difference or on "days" between values? If "days" between values describe how to calculate a partial day. The order of operations and approach on this step will make BIG differences in interpretation of results.
data have; input Studynum Proc1 :$12. ProcedureDate :datetime18.; format ProcedureDate datetime18.; datalines; 4 Debridement 09SEP13:15:34:00 4 Debridement 12SEP13:14:45:00 4 Debridement 20SEP13:20:22:00 5 Debridement 25OCT14:13:01:00 5 Debridement 26OCT14:15:01:00 ; run; data temp; set have; by studynum; ddt = dif(proceduredate); if first.studynum then ddt=.; run; Proc summary data=temp nway; class studynum; var ddt; output out=work.summary mean=; run; /* at this time the ddt is mean of successive datetime differences IN SECONDS */ proc print data=work.summary ; var studynum ddt; format ddt time12.; run;
The time format will show the mean number of hours, minutes and seconds in this case.
@stancemcgraw wrote:
I would like to create SAS code that calculates the total days between cases of debridement by studynum. I don't know how to do that.
Let's look at Study number 4. You have 3 numbers there so what is the total days?
The technical issue here is trivial, but the logic of how it should be calculated is what we would need to know to properly answer this question.
@ all respondents,
I think OP is figuring out the requirement while we are pondering the same. Sure, our responses are gonna get the OP more confused while business experts themselves perhaps do not communicate well to the tech team. Lately, it seems the industry absolutely needs people to be familiar TECH+BUSINESS which apparently is hard to find.
Anyways, lunch break 2:00 CST
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.