Hello everyone,
I am trying to calculate the difference between two dates that are in different formats, as shown below. The start date is in the format yyyymmdd and the end date is in the format ddmmmyy:hh:mm:ss.
Sdate | Edate |
20010130 | 01JUN01:09:01:00 |
20030524 | 04AUG03:16:10:00 |
20010421 | 01JUL01:08:15:00 |
The output dataset should show the difference between the two dates in days and months. The difference should include the end date in the calculation. The output dataset should look like this:
NoDays |
123 |
73 |
72 |
Thank you in advance!
@danwarags wrote:
The output dataset should show the difference between two dates in days and months.
How are you defining a 'month'? Your sample shows number of days and that's easy to calculate.
Dates are stored as the number of days since January 1, 1960, so you can do math on them, i.e. subtract them directly once they're SAS dates. To create SAS dates:
Convert datetime to date -> DATEPART(var)
Convert text to date -> INPUT( var, YYMMDD8.)
For each variable, what does PROC CONTENTS tell you about it: Is it numeric or character? If numeric, does it have a format?
It doesn't matter what the variable looks like when you print it. It matters what PROC CONTENTS reveals as the characteristics of the variable.
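The conversions suggested above could be put together roughly as follows. This is only a sketch under stated assumptions: Sdate is taken to be a character variable holding yyyymmdd text and Edate a numeric datetime value (PROC CONTENTS would confirm or refute that), and the names start_dt and end_dt are made up for illustration.

```sas
data have;
  /* Assumption: Sdate is character text, Edate is a numeric datetime */
  input Sdate :$8. Edate :datetime16.;
  format Edate datetime16.;
  datalines;
20010130 01JUN01:09:01:00
20030524 04AUG03:16:10:00
20010421 01JUL01:08:15:00
;
run;

data want;
  set have;
  start_dt = input(Sdate, yymmdd8.);  /* text -> SAS date */
  end_dt   = datepart(Edate);         /* datetime -> SAS date */
  NoDays   = end_dt - start_dt + 1;   /* +1 counts the end date itself */
  format start_dt end_dt date9.;
run;
```

With the sample rows this gives NoDays of 123, 73 and 72, matching the requested output. If Sdate turns out to be numeric (for example the number 20010130), it would first need to be turned into text, e.g. input(put(Sdate, 8.), yymmdd8.). The month count is left out because it depends on how a 'month' is defined; INTCK('month', start_dt, end_dt) is one option, but it counts month boundaries crossed rather than elapsed months.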
Thanks Reeza. I really appreciate your help. The code worked. Given Start_date and End_date as SAS dates, how do I calculate the difference between the earliest Start_date and the latest End_date within each ID? For example, I have a scenario as shown below:
ID | Start_date | End_date | Ddiff |
1 | 1/1/2001 | 1/12/2001 | 17 |
1 | 1/2/2001 | 1/18/2001 | |
1 | 1/6/2001 | 1/8/2001 | |
2 | 3/4/2001 | 3/8/2001 | 4 |
3 | 2/4/2002 | 2/12/2002 | 20 |
3 | 2/10/2002 | 2/24/2002 | |
3 | 2/14/2002 | 2/18/2002 | |
4 | 3/15/2003 | 3/18/2003 | 11 |
4 | 3/20/2003 | 3/26/2003 | |
I need to calculate Ddiff as in the scenario above: for each ID, it should be the difference between the earliest Start_date and the latest End_date.
Thank you!!
data have;
  infile cards expandtabs truncover;
  input ID Start_date : mmddyy10. End_date : mmddyy10.;
  format Start_date End_date mmddyy10.;
cards;
1 1/1/2001 1/12/2001 17
1 1/2/2001 1/18/2001
1 1/6/2001 1/8/2001
2 3/4/2001 3/8/2001 4
3 2/4/2002 2/12/2002 20
3 2/10/2002 2/24/2002
3 2/14/2002 2/18/2002
4 3/15/2003 3/18/2003 11
4 3/20/2003 3/26/2003
;
run;

proc sql;
  create table want as
  select *, max(end_date)-min(start_date) as dif
  from have
  group by id;
quit;
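Note that the query above remerges the group summary, so dif is repeated on every row of an ID, whereas the requested layout shows Ddiff only on the first row of each ID. One possible way to get that layout is sketched below. The ORDER BY clause, the want_first dataset name and the blanking step are additions for illustration, not part of the answer above.

```sas
data have;
  input ID Start_date :mmddyy10. End_date :mmddyy10.;
  format Start_date End_date mmddyy10.;
  datalines;
1 1/1/2001 1/12/2001
1 1/2/2001 1/18/2001
1 1/6/2001 1/8/2001
2 3/4/2001 3/8/2001
3 2/4/2002 2/12/2002
3 2/10/2002 2/24/2002
3 2/14/2002 2/18/2002
4 3/15/2003 3/18/2003
4 3/20/2003 3/26/2003
;
run;

proc sql;
  /* Latest end minus earliest start per ID, remerged onto every row */
  create table want as
  select ID, Start_date, End_date,
         max(End_date) - min(Start_date) as Ddiff
  from have
  group by ID
  order by ID, Start_date;
quit;

data want_first;
  set want;
  by ID;
  /* Assumption: Ddiff should appear on the first row of each ID only */
  if not first.ID then Ddiff = .;
run;
```

For the sample data this leaves Ddiff values of 17, 4, 20 and 11 on the first row of IDs 1 to 4 and missing values elsewhere.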