09-28-2016 07:45 PM
I need to calculate difference of two dates in different format.
At time point 1 (earlier date)
has 3 variables "B29" for month, of type "Number", Length 8, Format Best12., Informat F12., which shows value in the form of "1" through "12", representing month: January through December.
"B29_B" for day, of type "Number", Length 8, Format Best12., Informat F12., which shows value in the form of "1" through "31", representing days.
"B29_C" for year, of type "Number", Length 8, Format Best12., Informat F12., which shows value in the form of "1990" through "2016", representing years.
At time point 2 (later date)
has just 1 variable "ENDDATE", type "Text", Length 48, Format $F48., Informat $48., which shows value in the form of "20160125", for example.
So to combine the 3 variables at time point 1 to be 1 variable in the same form of the variable "ENDDATE", which SAS function should be the best? Or should they (both time point 1 and time point 2 variables) be both converted to a third format?
Will the codes below be good? Or needs to be modified or changed to a better approach?
format date mmddyy10.;
To calculate difference of two columns of dates after conversion in the same format, which SAS function should be the best? I found DATDIF and INTCK, but don't know which or if there's a better way to proceed, based on the ideal date format to be chosen in my question 1 above.
If I try to adapt the codes below, should the changes be just the variable names?
09-28-2016 08:14 PM - edited 09-28-2016 08:15 PM
SAS dates are stored in numeric variables as the count of days since 01jan1960. The format applied to this numeric variable then determines how this number gets printed (it doesn't change the internal values).
So first thing you want to do, is to convert text strings representing a date to a SAS date numeric value stored in a numeric variable.
You can use an input statement with the appropriate informat for this, or there is also an mdy() function you can use.
Once you've got both dates represented as SAS dates (so number of days since 1/1/1960) you can actually simply substract the values from each other to get the days. You don't need any function. Having said that: Both functions you've used in your sample code would work so no harm to take this approach.
09-29-2016 12:14 AM - edited 09-29-2016 12:17 AM
As @Patrick explained, convert to SAS dates and use either function. An example:
data _null_; B29 = 5; B29_B = 16; B29_C = 1998; ENDDATE = "20160125"; date1 = input(catx("-", B29_C, B29, B29_B), anydtdte.); date2 = input(ENDDATE, anydtdte.); days = intck("DAY", date1, date2); put date1 date9. / date2 date9. / days=; run;
16MAY1998 25JAN2016 days=6463
10-03-2016 09:07 PM
Thank you all for the reply.
I used some similar codes (see below) which work fine for the cases when B29,B29_B,B29_C are all non-missing. However, when only year is available (case 1), or only year and month are available (case 2), the codes will just skip these and do not calculate the difference in days.
My goal actually is to classify the difference in days into 2 groups (Yes and No):
Yes for 0 =< diffdays <=30
No for diffdays >30
So I modified the codes by adding the lines in bold to find the difference in year, but that will not resolve the whole issue for both cases, and needs to be further investigated.
Also, if I just want to do a count for Case 1 (the cases when only year is available), and also a count for Case 2 (the cases when only year and month are available), how do I do that? proc freq, table, where=? Based on the counts/frequency, we can access and decide how to treat the cases with the missing days/months.
if f_B29 <0 then f_B29n = . ;
else f_B29n = f_B29 ;
*format f_B29n f_B29n. ;
label f_B29n = "Month";
if f_B29_B <0 then f_B29_Bn = . ;
else f_B29_Bn = f_B29_B ;
*format f_B29_Bn f_B29_Bn. ;
label f_B29_Bn = "Day";
if f_B29_C <0 then f_B29_Cn = . ;
else f_B29_Cn = f_B29_C ;
*format f_B29_Cn f_B29_Cn. ;
label f_B29_Cn = "Year";
format f_enddaten f_date mmddyy10.;
f_diffyears= year(f_enddaten)- f_B29_Cn;
if f_diffdays <0 then f_diffdaysn = .;
else if 0 <= f_diffdays <= 30 then f_diffdaysn = 1;
else if f_diffdays >30 then f_diffdaysn = 2;
label f_diffdaysn = "Last smoked or had a puff on a cigarette within the past 30 days" ;
format f_diffdaysn YN.;