Pdogra
Calcite | Level 5
Experts:

I am facing a situation.

I have a date in M/D/YYYY format, stored as a string.
M stands for a month that may be a single digit (i.e. 1 instead of 01 for Jan).
D stands for a day that may be a single digit (e.g. 1 instead of 01 for the first day of the month), and YYYY is a 4-digit year.

Is there a way I can use rxchange or prxchange to look for single-digit months and days and convert them to double digits? E.g.

7/07/2010 to 07/07/2010 or
7/7/2010 to 07/07/2010 or
10/1/2010 to 10/01/2010

Here's what I tried:
DATA try;
INPUT dt $CHAR40.;
txt= prxparse( 's/(\d\/)/0\d\//');
text = prxchange(txt, -1, dt);
DATALINES;
7/07/2009
;

result :
dt txt text

7/07/2009 1 0\d/00\d/2009

Message was edited by: Pdogra
art297
Opal | Level 21
If you want to learn regular expressions, by all means, let the forum know that you are still waiting for a response. However, you indicated that you wanted to correct the situation that you had. You can do that using only PUT and INPUT. For example:

data want;
set have;
want_date=put(input(have_date,mmddyy10.),mmddyy10.);
run;

HTH,
Art
Pdogra
Calcite | Level 5
Wouldn't the mmddyy10. informat consistently need 2 digits for the month, 2 digits for the day and 4 for the year?

The date strings I have are erratic: months are 1 digit through September and 2 digits for Oct, Nov and Dec, and days are likewise a single digit through the 9th of the month and 2 digits from the 10th through the 30th/31st.
sbb
Lapis Lazuli | Level 10
As demonstrated in the SAS program below, the answer is no.

Scott Barry
SBBWorks, Inc.


data _null_;
dt = input('1/1/2010',mmddyy10.);
put dt= date9. dt= mmddyy10.;
run;

dt=01JAN2010 dt=01/01/2010
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
art297
Opal | Level 21
I agree with Scott. Try the following:

data have;
input have_date $10.;
format date date9.;
date=input(have_date,mmddyy10.);
cards;
7/1/2010
06/5/2009
04/05/2010
5/06/2010
;

Art
deleted_user
Not applicable
It may not be the most appropriate approach, but the Perl regular expression solution is interesting.


DATA try;
INPUT dt $CHAR40. ;
length text $10. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?<!\d)(\d)\//0$1\//');
text = prxchange(txt, -1, dt);
DATALINES;
6/07/2009
6/7/2010
12/8/2008
;

/(?<!\d)(\d)/ matches any digit that is not immediately preceded by another digit.

/(?<!\d)(\d)\// matches any digit that is not immediately preceded by another digit and that is
followed by a slash.

Message was edited by: sensas
Pdogra
Calcite | Level 5
Hi Sensas,
Thanks, the code works. There is a bit of added complexity: I have a time piece attached to the date as well. Which piece of the code should I change to make it work?

Data looks like:
6/7/2010 12:56:08.56
10/5/2010 5:46
10/5/2010 15:46:09

I am not an expert, but it looks like I may have to write a line to handle the digits before the colon:



DATA try;
INPUT dt $CHAR40. ;
length text $10. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?
deleted_user
Not applicable
The code works with or without time pieces because the regular expression just looks for a lone digit followed by a slash.


You just have to change the length of the text variable:



DATA try;
INPUT dt $CHAR40. ;
length text $40. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?<!\d)(\d)\//0$1\//');
text = prxchange(txt, -1, dt);
DATALINES;
6/7/2010 12:56:08.56
10/5/2010 5:46
10/5/2010 15:46:09
;
deleted_user
Not applicable
With a few changes to the regular expression, you can also pad the time pieces:


DATA try;
INPUT dt $CHAR40. ;
length text $40. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?<!\d)(\d(?:\/|:))/0$1/');
text = prxchange(txt, -1, dt);
DATALINES;
6/7/2010 12:56:08.56
10/5/2010 5:46
10/5/2010 15:46:09
;

5:46 will be changed into 05:46.

A Perl regexp documentation link: http://perldoc.perl.org/perlre.html
Be careful: SAS doesn't implement the whole of the Perl syntax...
Pdogra
Calcite | Level 5
Thank you very much Sensas.

Could you please point me to documentation to help augment my knowledge of regex?

Also, what is the significance of ! and | in your code?
if _n_ eq 1 then txt= prxparse( 's/(?
art297
Opal | Level 21
Two last points regarding a non-PERL solution, but ones which you will need to take into account with either approach: (1) you can't format a datetime with a date format and (2) if you ultimately want to input the datetimes as SAS datetime formatted variables, they will have to be structured in a way that one of the informats can be used.

The following, I think, does that for the sample data you provided:

options datestyle=mdy;

data have;
input have_date $18.;
format date date9.;
date=datepart(input(have_date||":00:00",anydtdtm20.));
cards;
7/1/2010:12:57
06/5/2009:10:57
04/05/2010:9:48
5/06/2010:6:48
;

Art
sbb
Lapis Lazuli | Level 10
Not sure of the relevance, but a DATETIME variable (seconds since 1/1/1960) can be formatted to only display the date portion, as shown below:


data _null_;
x = datetime();
put x= datetime7. x=datetime9.;
run;

x=14SEP10 x=14SEP2010
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


The important point to emphasize here is that you have limited sorting and display control over a DATE or DATETIME value stored as a character string, and the string may not even be a legitimate date. That is yet another reason to run it through a SAS-standard INFORMAT rather than a band-aid approach using PERL, which frankly makes the eyes on my mouse-cursor roll back when I see the command/function syntax. The best approach here is to use the INPUT function with an appropriate SAS INFORMAT, and let SAS decide whether the character string is a valid numeric DATE or DATETIME value.
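As a minimal sketch of letting the informat do the validation: the ?? modifier on INPUT suppresses the invalid-data note, so a string that is not a valid date simply comes back as a missing value. The dataset name check, the variables raw, dt and is_valid, and the sample rows are made up for illustration and are not from the original data.

```sas
data check;
  length raw $20;
  input raw $char20.;
  /* ?? suppresses the invalid-data note and the _ERROR_ flag;  */
  /* dt is missing when raw is not a valid M/D/YYYY date        */
  dt = input(raw, ?? mmddyy10.);
  is_valid = not missing(dt);   /* 1 = valid date, 0 = rejected */
  format dt mmddyy10.;
  put raw= dt= is_valid=;
  datalines;
7/1/2010
13/45/2010
10/5/2010
;
run;
```

The second row (month 13, day 45) should be flagged with is_valid=0, while the single-digit month and day in the other rows are read without any padding step.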


Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
(?<!pattern)

A zero-width negative look-behind assertion. For example /(?<!bar)foo/ matches any occurrence of "foo" that does not follow "bar". Works only for fixed-width look-behind.


Source: perl documentation


The "!" means "not", "<" means "behind" and "?" means "is it?".

So you can read (?<!pattern) as "Is the pattern not preceding?" If that is true, then you are sure "pattern" does not precede what you're looking for.

"|" is the or operator.
In the expression "\d(?:\/|:)", I search for a digit that is immediately followed by / OR : (the slash is written \/ because / is also the regex delimiter).

You can find SAS documentation on Perl regular expressions by googling "site:sas.com SUGI Perl"
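To see the look-behind and the alternation at work, here is a small sketch that only tests for a match with PRXMATCH instead of substituting. The variable names re, s and pos and the four test strings are just for illustration.

```sas
data _null_;
  length s $20;
  /* (?<!\d)  : the position must not be preceded by a digit     */
  /* \d       : one digit                                        */
  /* (?:\/|:) : non-capturing group, a slash OR a colon          */
  re = prxparse('/(?<!\d)\d(?:\/|:)/');
  do s = '6/7/2010', '12/25/2010', '5:46', '15:46';
    pos = prxmatch(re, s);   /* position of first match, 0 if none */
    put s= pos=;
  end;
run;
```

Only the strings containing a lone digit before a slash or colon ('6/7/2010' and '5:46') should give a non-zero pos; in '12/25/2010' and '15:46' every digit that precedes a separator is itself preceded by another digit, so the look-behind rejects it.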
Pdogra
Calcite | Level 5
Appreciate your help sensas.

Thanks,
PD
Pdogra
Calcite | Level 5
Hi Sensas,

I played with regex a little bit, but I guess I am nowhere near your level of expertise.

I am trying 2 things:
1) Look for milliseconds in the time part and remove them, e.g.
12/9/2010 18:06:54.55 (remove .55)
2) Look for seconds and, if not found, append :00 (e.g. 12/9/2010 18:06 to 12/9/2010 18:06:00)

Is it possible to do both in the same line? Sorry, I don't mean to make you do my job. I am sure by the time I figure this out, you will have a solution for it.

Thanks,

Here's your code:

DATA try;
INPUT dt $CHAR40. ;
length text $40. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?
