Hi community,
I have a column od datetime as below:
DFCREATE |
17APR04:15:54:05 |
17APR04:15:56:09 |
17APR05:15:36:48 |
17APR05:15:43:17 |
17APR27:16:30:15 |
17AUG25:09:51:58 |
17APR10:14:04:58 |
17APR10:14:07:37 |
17APR10:14:11:16 |
17APR21:12:27:26 |
17JUN17:12:52:39 |
17APR19:11:55:46 |
17SEP28:17:28:47 |
17JUN01:16:23:07 |
17JUN01:16:24:30 |
17AUG08:15:59:43 |
17AUG08:16:00:54 |
and I would like ot extract the date part, but when I use
creat_date = datepart(DFCREATE);
format creat_date is8601da.;
new_date |
2004-04-17 |
2004-04-17 |
2005-04-17 |
2005-04-17 |
1927-04-17 |
2025-08-17 |
2010-04-17 |
2010-04-17 |
2010-04-17 |
2021-04-17 |
2017-06-17 |
2019-04-17 |
the converted date is just wrong, the first observation should be 2017-04-04, not 2004-04-17.
Any ideas? Thank you so much
.
Your source data is a string in the form YYMONDD:HH:MM:SS
You used informat ANYDTDTM40. to read this string and convert it to a SAS DateTime value (=numerical value; count of seconds since 1/1/1960).
So now by using ANYDTDTM40. SAS will try to read the text string as DDMONYY:HH:MM:SS - and because the text string (your data) also represents a valid datetime in this form SAS will use it and convert it to the number of seconds since 1/1/1960. Now you have a SAS DataTime value and whatever operations and formats you use on this value it's always against the "wrong" count of seconds.
-> your current code gets it wrong when converting the source string into a SAS DateTime value.
Others wrote this already but here again:
What you need to do is first to read the string into a SAS character variable and then "reshuffle" it into a form which a SAS datetime informat can interprete the way you need it. Below a code sample illustrating how this could work (for the date portion of the string).
/* read the excel sheet into SAS */
PROC IMPORT
OUT= source
DATAFILE= "c:\temp\Convert dates.xlsx"
DBMS=xlsx
REPLACE;
SHEET="#LN00046";
GETNAMES=YES;
RUN;
/*18DEC13:11:45:31*/
/*
extract the date string portion from column date_old
and convert it to a SAS date value (number of seconds since 1/1/1960
DT string is in the form: YYMONDD:HH:MM:SS
*/
data want;
set source;
length dt_string $7;
format dt_SASDateValue is8601da.;
dt_string =cats(substrn(date_old,6,2),substrn(date_old,3,3),substrn(date_old,1,2));
dt_SASDateValue =input(dt_string,?? date7.);
run;
So you have your dates as
yyMONdd?
That's a very unusual format.
Are you sure? The standard is ddMONyy.
There is no informat in SAS for yyMONdd, so you'll have to parse the string manually if that's the case.
Of course 4-digit years would resolve any doubt, but Y2k is so far away...
the date and time was system generated in the format of "17APR04:15:54:05".
I checked the clumn attribute, the format of variable is DATETIME. and informat is ANYDTDTM40.
I need to extarct and then convert the date portion to compare with another date (e.g., 2017-03-01).
As Chris said, that's an unusual format.
Regardless, I thought it might be a fun exercise to dust off my Perl regular expressions...
data have;
length DFCREATE $30;
input dfcreate;
datalines;
17APR04:15:54:05
17APR04:15:56:09
17APR05:15:36:48
17APR05:15:43:17
17APR27:16:30:15
17AUG25:09:51:58
17APR10:14:04:58
17APR10:14:07:37
17APR10:14:11:16
17APR21:12:27:26
17JUN17:12:52:39
17APR19:11:55:46
17SEP28:17:28:47
17JUN01:16:23:07
17JUN01:16:24:30
17AUG08:15:59:43
17AUG08:16:00:54
;
run;
data want;
set have;
length date_buffer time_buffer $20;
rx1=prxparse("s/^(\d{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(\d{2}):(.*)/\3\2\1/io");
rx2=prxparse("s/^(\d{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(\d{2}):(.*)/\4/io");
date_buffer=prxchange(rx1,1,strip(dfcreate));
time_buffer=prxchange(rx2,1,strip(dfcreate));
date=input(date_buffer,date7.);
format date date9.;
run;
Thank you very much, Scott
Your program works only if the variable is TEXT, right? But in my original dataset, the format of variable is DATETIME. and informat is ANYDTDTM40.
I appreciated your help!
@zimcom wrote:
Thank you very much, Scott
Your program works only if the variable is TEXT, right? But in my original dataset, the format of variable is DATETIME. and informat is ANYDTDTM40.
I appreciated your help!
ANYDTDTM informat uses a lot of assumptions. In this case the assumption in a datetime appearing value is that the first value is the day of the month not the year. And again, did no one learn from Y2K that two-digit years are a bad idea?
So read the original value as text, not the likely assigned by proc import ANYDATEDTM informat and then parse as @ScottBass suggests. Or other parsing technique.
This example shows that ANYDTDTM does not want to see a year in the first position at all:
data junk; x='17APR04:15:54:05'; y='2017APR04:15:54:05'; datex= input(x,anydtdtm40.); datey= input(y,anydtdtm40.); format datex datey datetime20.; run;
I have attached a sample file and I want to have another column with the new date as yyyy-mm-dd in date fomat so that I can calculate the days between 2 dates.
Your example file does not contain dates in Excel either, they are text. Import the field as TEXT, do not use the ANYDTDTM informat but a $20. or similar long enough. And then parse the text variable.
Or go back to whoever created that file and ask why they are using a date format that practically no one else uses and see if they can export the data in a more typical format such as having a 4 digit year and a month number instead of three letter abbreviation.
I used proc import to read in the csv file, after read in the date variable is date format.
The file was downloaded from the database and it seems like it is system date and time and it was created like this.
@zimcom wrote:I have attached a sample file and I want to have another column with the new date as yyyy-mm-dd in date fomat so that I can calculate the days between 2 dates.
As long as the date is stored as a date and not text, the format used to display it is irrelevant with respect to calculating the days between 2 dates.
Can you change the format to DATETIME21. or DATETIME21.2 and post the results?
data test;
dt1=datetime();
dt2=dt1;
dt3=dt1;
format dt1 datetime. dt2 datetime18. dt3 datetime21.2;
run;
dt1 | dt2 | dt3 |
07FEB19:19:09:41 | 07FEB19:19:09:41 | 07FEB2019:19:09:40.56 |
I meant from your data, the sample code was just to illustrate the different results, esp. with DATETIME21. Apologies for the confusion.
usubjid | aestdtc | old_date | new_date |
1001 | 2017-03-02 | 17APR04:15:54:05 | 17APR2004:15:54:05.00 |
1001 | 2017-03-03 | 17APR04:15:56:09 | 17APR2004:15:56:09.00 |
1002 | 2017-03-01 | 17APR05:15:36:48 | 17APR2005:15:36:48.00 |
1002 | 2017-03-03 | 17APR05:15:43:17 | 17APR2005:15:43:17.00 |
1002 | 2017-03-04 | 17APR27:16:30:15 | 17APR1927:16:30:15.00 |
1002 | 2017-03-01 | 17AUG25:09:51:58 | 17AUG2025:09:51:58.00 |
Thank you!
@zimcom wrote:
usubjid aestdtc old_date new_date 1001 2017-03-02 17APR04:15:54:05 17APR2004:15:54:05.00 1001 2017-03-03 17APR04:15:56:09 17APR2004:15:56:09.00 1002 2017-03-01 17APR05:15:36:48 17APR2005:15:36:48.00 1002 2017-03-03 17APR05:15:43:17 17APR2005:15:43:17.00 1002 2017-03-04 17APR27:16:30:15 17APR1927:16:30:15.00 1002 2017-03-01 17AUG25:09:51:58 17AUG2025:09:51:58.00
Thank you!
And your original post says:
the converted date is just wrong, the first observation should be 2017-04-04, not 2004-04-17.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.