BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zimcom
Pyrite | Level 9

 

 

Hi community,

 

I have a column od datetime as below:

DFCREATE
17APR04:15:54:05
17APR04:15:56:09
17APR05:15:36:48
17APR05:15:43:17
17APR27:16:30:15
17AUG25:09:51:58
17APR10:14:04:58
17APR10:14:07:37
17APR10:14:11:16
17APR21:12:27:26
17JUN17:12:52:39
17APR19:11:55:46
17SEP28:17:28:47
17JUN01:16:23:07
17JUN01:16:24:30
17AUG08:15:59:43
17AUG08:16:00:54

 

and I would like ot extract the date part, but when I use 

 

creat_date = datepart(DFCREATE);
format creat_date is8601da.;

 

new_date
2004-04-17
2004-04-17
2005-04-17
2005-04-17
1927-04-17
2025-08-17
2010-04-17
2010-04-17
2010-04-17
2021-04-17
2017-06-17
2019-04-17

 

the converted date is just wrong, the first observation should be 2017-04-04, not 2004-04-17.

 

Any ideas?  Thank you so much 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@zimcom 

Your source data is a string in the form YYMONDD:HH:MM:SS

You used informat ANYDTDTM40. to read this string and convert it to a SAS DateTime value (=numerical value; count of seconds since 1/1/1960).

 

So now by using ANYDTDTM40. SAS will try to read the text string as DDMONYY:HH:MM:SS - and because the text string (your data) also represents a valid datetime in this form SAS will use it and convert it to the number of seconds since 1/1/1960. Now you have a SAS DataTime value and whatever operations and formats you use on this value it's always against the "wrong" count of seconds.

-> your current code gets it wrong when converting the source string into a SAS DateTime value.

 

Others wrote this already but here again:

What you need to do is first to read the string into a SAS character variable and then "reshuffle" it into a form which a SAS datetime informat can interprete the way you need it. Below a code sample illustrating how this could work (for the date portion of the string).

/* read the excel sheet into SAS */
PROC IMPORT 
  OUT= source 
  DATAFILE= "c:\temp\Convert dates.xlsx" 
  DBMS=xlsx 
  REPLACE;
  SHEET="#LN00046";
  GETNAMES=YES;
RUN;

/*18DEC13:11:45:31*/
/* 
  extract the date string portion from column date_old
  and convert it to a SAS date value (number of seconds since 1/1/1960
  DT string is in the form: YYMONDD:HH:MM:SS
*/
data want;
  set source;
  length dt_string $7;
  format dt_SASDateValue is8601da.;
  dt_string       =cats(substrn(date_old,6,2),substrn(date_old,3,3),substrn(date_old,1,2));
  dt_SASDateValue =input(dt_string,?? date7.);
run;

View solution in original post

18 REPLIES 18
ChrisNZ
Tourmaline | Level 20

So you have your dates as 

yyMONdd?

That's a very unusual format.

Are you sure? The standard is ddMONyy.

There is no informat in SAS for yyMONdd, so you'll have to parse the string manually if that's the case.

Of course 4-digit years would resolve any doubt, but Y2k is so far away...

zimcom
Pyrite | Level 9

the date and time was system generated in the format of "17APR04:15:54:05".

I checked the clumn attribute, the format of variable is DATETIME. and informat is ANYDTDTM40.

I need to extarct and then convert the date portion to compare with another date (e.g., 2017-03-01).

ScottBass
Rhodochrosite | Level 12

As Chris said, that's an unusual format.

 

Regardless, I thought it might be a fun exercise to dust off my Perl regular expressions...

 

 

data have;
   length DFCREATE $30;
   input dfcreate;
   datalines;
17APR04:15:54:05
17APR04:15:56:09
17APR05:15:36:48
17APR05:15:43:17
17APR27:16:30:15
17AUG25:09:51:58
17APR10:14:04:58
17APR10:14:07:37
17APR10:14:11:16
17APR21:12:27:26
17JUN17:12:52:39
17APR19:11:55:46
17SEP28:17:28:47
17JUN01:16:23:07
17JUN01:16:24:30
17AUG08:15:59:43
17AUG08:16:00:54
;
run;

data want;
   set have;
   length date_buffer time_buffer $20;
   rx1=prxparse("s/^(\d{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(\d{2}):(.*)/\3\2\1/io");
   rx2=prxparse("s/^(\d{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(\d{2}):(.*)/\4/io");
   date_buffer=prxchange(rx1,1,strip(dfcreate));
   time_buffer=prxchange(rx2,1,strip(dfcreate));
   date=input(date_buffer,date7.);
   format date date9.;
run;

 


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
zimcom
Pyrite | Level 9

@ScottBass 

 

Thank you very much, Scott

 

Your program works only if the variable is TEXT, right? But in my original dataset, the format of variable is DATETIME. and informat is ANYDTDTM40.

 

I appreciated your help!

ballardw
Super User

@zimcom wrote:

@ScottBass 

 

Thank you very much, Scott

 

Your program works only if the variable is TEXT, right? But in my original dataset, the format of variable is DATETIME. and informat is ANYDTDTM40.

 

I appreciated your help!


ANYDTDTM informat uses a lot of assumptions. In this case the assumption in a datetime appearing value is that the first value is the day of the month not the year. And again, did no one learn from Y2K that two-digit years are a bad idea?

So read the original value as text, not the likely assigned by proc import ANYDATEDTM informat and then parse as @ScottBass suggests. Or other parsing technique.

 

This example shows that ANYDTDTM does not want to see a year in the first position at all:

data junk;
   x='17APR04:15:54:05';
   y='2017APR04:15:54:05';

   datex= input(x,anydtdtm40.);
   datey= input(y,anydtdtm40.);
   format datex datey datetime20.;
run;
zimcom
Pyrite | Level 9

I have attached a sample file and I want to have another column with the new date as yyyy-mm-dd in date fomat so that I can calculate the days between 2 dates. 

ballardw
Super User

Your example file does not contain dates in Excel either, they are text. Import the field as TEXT, do not use the ANYDTDTM informat but a $20. or similar long enough. And then parse the text variable.

 

Or go back to whoever created that file and ask why they are using a date format that practically no one else uses and see if they can export the data in a more typical format such as having a 4 digit year and a month number instead of three letter abbreviation.

zimcom
Pyrite | Level 9

I used proc import to read in the csv file, after read in the date variable is date format.

The file was downloaded from the database and it seems like it is system date and time and it was created like this. 

ScottBass
Rhodochrosite | Level 12

@zimcom wrote:

I have attached a sample file and I want to have another column with the new date as yyyy-mm-dd in date fomat so that I can calculate the days between 2 dates. 


 

As long as the date is stored as a date and not text, the format used to display it is irrelevant with respect to calculating the days between 2 dates.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
ScottBass
Rhodochrosite | Level 12

Can you change the format to DATETIME21. or DATETIME21.2 and post the results?

 

data test;
   dt1=datetime();
   dt2=dt1;
   dt3=dt1;
   format dt1 datetime. dt2 datetime18. dt3 datetime21.2;
run;

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
zimcom
Pyrite | Level 9
dt1dt2dt3
07FEB19:19:09:4107FEB19:19:09:4107FEB2019:19:09:40.56
ScottBass
Rhodochrosite | Level 12

I meant from your data, the sample code was just to illustrate the different results, esp. with DATETIME21.  Apologies for the confusion.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
zimcom
Pyrite | Level 9
usubjidaestdtcold_datenew_date
10012017-03-0217APR04:15:54:0517APR2004:15:54:05.00
10012017-03-0317APR04:15:56:0917APR2004:15:56:09.00
10022017-03-0117APR05:15:36:4817APR2005:15:36:48.00
10022017-03-0317APR05:15:43:1717APR2005:15:43:17.00
10022017-03-0417APR27:16:30:1517APR1927:16:30:15.00
10022017-03-0117AUG25:09:51:5817AUG2025:09:51:58.00

 

Thank you!

ScottBass
Rhodochrosite | Level 12

@zimcom wrote:
usubjidaestdtcold_datenew_date
10012017-03-0217APR04:15:54:0517APR2004:15:54:05.00
10012017-03-0317APR04:15:56:0917APR2004:15:56:09.00
10022017-03-0117APR05:15:36:4817APR2005:15:36:48.00
10022017-03-0317APR05:15:43:1717APR2005:15:43:17.00
10022017-03-0417APR27:16:30:1517APR1927:16:30:15.00
10022017-03-0117AUG25:09:51:5817AUG2025:09:51:58.00

 

Thank you!


 

And your original post says:

the converted date is just wrong, the first observation should be 2017-04-04, not 2004-04-17.

 


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 8724 views
  • 4 likes
  • 5 in conversation