Pdogra
Calcite | Level 5
Experts:

I am facing a situation.

I have dates stored as strings in the format M/D/YYYY.
M stands for a month that may be a single digit (i.e. 1 instead of 01 for January).
D stands for a day that may be a single digit (e.g. 1 instead of 01 for the first day of the month), and YYYY is a 4-digit year.

Is there a way I can use rxchange or prxchange to look for single-digit months and days and convert them to two digits? E.g.

7/07/2010 to 07/07/2010 or
7/7/2010 to 07/07/2010 or
10/1/2010 to 10/01/2010

Here's what I tried:
DATA try;
INPUT dt $CHAR40.;
txt= prxparse( 's/(\d\/)/0\d\//');
text = prxchange(txt, -1, dt);
DATALINES;
7/07/2009
;

result:
dt          txt   text

7/07/2009   1     0\d/00\d/2009
art297
Opal | Level 21
If you want to learn regular expressions, by all means let the forum know that you are still waiting for a response. However, you indicated that you wanted to correct the situation you have, and you can do that using only PUT and INPUT. For example:

data want;
set have;
want_date=put(input(have_date,mmddyy10.),mmddyy10.);
run;

HTH,
Art
Pdogra
Calcite | Level 5
Wouldn't the mmddyy10. informat consistently need 2 digits for the month, 2 digits for the day and 4 for the year?

The date strings I have are erratic: mostly 1 digit for the months up to September and 2 digits for Oct, Nov and Dec. The same goes for the day: a single digit up to the 9th of the month and 2 digits from the 10th through the 30th/31st.
sbb
Lapis Lazuli | Level 10
As demonstrated in the SAS program below, the answer is no.

Scott Barry
SBBWorks, Inc.


data _null_;
dt = input('1/1/2010',mmddyy10.);
put dt= date9. dt= mmddyy10.;
run;

dt=01JAN2010 dt=01/01/2010
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
art297
Opal | Level 21
I agree with Scott. Try the following:

data have;
input have_date $10.;
format date date9.;
date=input(have_date,mmddyy10.);
cards;
7/1/2010
06/5/2009
04/05/2010
5/06/2010
;

Art
deleted_user
Not applicable
Even if it is not the most appropriate approach, the Perl parsing solution is interesting.


DATA try;
INPUT dt $CHAR40. ;
length text $10. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?<!\d)(\d)\//0$1\//');
text = prxchange(txt, -1, dt);
DATALINES;
6/07/2009
6/7/2010
12/8/2008
;

/(?<!\d)(\d)/ matches any digit that is not preceded by another digit.

/(?<!\d)(\d)\// matches any digit that is not preceded by another digit and that is
followed by a slash.
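
To see why the look-behind matters, here is a small comparison, offered as a sketch rather than as part of the original answer. The variable names with_lb and without_lb are made up for illustration, and the pattern is passed to PRXCHANGE as a string instead of going through PRXPARSE. In both patterns, $1 in the replacement stands for the captured digit; writing \d there, as in the first attempt in this thread, inserts the literal characters \d.

```sas
data _null_;
   length dt without_lb with_lb $20;
   input dt $char20.;
   /* No look-behind: the last digit of a two-digit month also matches */
   without_lb = prxchange('s/(\d)\//0$1\//', -1, dt);
   /* Look-behind: only a digit NOT preceded by another digit matches */
   with_lb    = prxchange('s/(?<!\d)(\d)\//0$1\//', -1, dt);
   put dt= without_lb= with_lb=;
   datalines;
6/7/2010
12/8/2008
;
```

Without the look-behind, the 2 in 12/ is treated as a lone digit, so 12/8/2008 should come back as 102/08/2008; with it, the month is left alone and only the day is padded.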
Pdogra
Calcite | Level 5
Hi Sensas,
Thanks, the code works. There is a bit of added complexity, though: I have a time piece attached to the date as well. Which piece of the code should I change to make it work?

Data looks like:
6/7/2010 12:56:08.56
10/5/2010 5:46
10/5/2010 15:46:09

I am not an expert, but it looks like I may have to write a line to handle the digits before the colon:



DATA try;
INPUT dt $CHAR40. ;
length text $10. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?
deleted_user
Not applicable
The code works with or without time pieces because the regular expression just looks for a lone digit followed by a slash.


You just have to change the length of the text variable:



DATA try;
INPUT dt $CHAR40. ;
length text $40. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?<!\d)(\d)\//0$1\//');
text = prxchange(txt, -1, dt);
DATALINES;
6/7/2010 12:56:08.56
10/5/2010 5:46
10/5/2010 15:46:09
;
deleted_user
Not applicable
With a few changes to the regular expression you can also pad the time pieces:


DATA try;
INPUT dt $CHAR40. ;
length text $40. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?<!\d)(\d(?:\/|:))/0$1/');
text = prxchange(txt, -1, dt);
DATALINES;
6/7/2010 12:56:08.56
10/5/2010 5:46
10/5/2010 15:46:09
;

5:46 will be changed into 05:46.

A Perl regexp documentation link: http://perldoc.perl.org/perlre.html
Be careful: SAS doesn't implement the whole of Perl's syntax...
Pdogra
Calcite | Level 5
Thank you very much Sensas.

Could you please point me to documentation to help augment my knowledge of regex?

Also, what is the significance of ! and | in your code?
if _n_ eq 1 then txt= prxparse( 's/(?
art297
Opal | Level 21
Two last points regarding a non-Perl solution, but ones you will need to take into account with either approach: (1) you can't format a datetime with a date format, and (2) if you ultimately want to read the datetimes into SAS datetime variables, they will have to be structured in a way that one of the informats can read.

The following, I think, does that for the sample data you provided:

options datestyle=mdy;

data have;
input have_date $18.;
format date date9.;
date=datepart(input(have_date||":00:00",anydtdtm20.));
cards;
7/1/2010:12:57
06/5/2009:10:57
04/05/2010:9:48
5/06/2010:6:48
;

Art
sbb
Lapis Lazuli | Level 10
Not sure the relevance, but a DATETIME variable (seconds since 1/1/1960) can be formatted to only display the date-portion, as shown below:


data _null_;
x = datetime();
put x= datetime7. x=datetime9.;
run;

x=14SEP10 x=14SEP2010
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


The important point to emphasize here is that you have limited sorting/display control over a DATE and/or DATETIME value held as a character string; also consider that the string may not even be a legitimate date. That is yet another reason to run it through a SAS-standard INFORMAT rather than a band-aid approach using Perl, which frankly makes the eyes on my mouse cursor roll back when I see the function syntax. The best approach here is to use the INPUT function with an appropriate SAS INFORMAT, and let SAS decide whether the character string is a valid numeric DATE or DATETIME value.


Scott Barry
SBBWorks, Inc.
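
The suggestion above, letting an informat decide whether a string is a valid date, could be sketched roughly as follows. This is an illustration under assumptions, not Scott's code: the data set name checked and the variables raw and is_valid are invented, and only the date part (the text before the first blank) is validated.

```sas
data checked;
   length raw $40;
   input raw $char40.;
   /* ?? suppresses the invalid-data notes and keeps _ERROR_ from being set */
   dt = input(scan(raw, 1, ' '), ?? mmddyy10.);
   /* A missing result means the informat rejected the string */
   is_valid = not missing(dt);
   format dt mmddyy10.;
   put raw= dt= is_valid=;
   datalines;
7/1/2010
13/45/2010
10/5/2010 15:46:09
;
```

A string such as 13/45/2010 has the right shape for the regular expression approach but is not a real date, so the informat returns a missing value for it.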
deleted_user
Not applicable
(?<!pattern)

A zero-width negative look-behind assertion. For example /(?<!bar)foo/ matches any occurrence of "foo" that does not follow "bar". Works only for fixed-width look-behind.


Source: perl documentation


The "!" means "not", "<" means "look back" and "?" means "is ?".

So you can translate (?<!pattern) as "Is the pattern not preceding?" If that is true, then you are sure "pattern" does not precede what you're looking for.

"|" is the or operator.
In the expression "\d(?:\/|:)", I search for a digit that is immediately followed by / OR :. The (?: ) around the alternatives only groups them; it does not capture anything.

You can find SAS documentation on Perl regular expressions by googling "site:sas.com SUGI Perl"
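
As a small illustration of the alternation, the sketch below runs the pattern from this thread next to an equivalent one that uses a character class, [\/:], in place of (?:\/|:). The character-class version and the variable names alt, cls and same are additions for illustration, not part of the original answer.

```sas
data _null_;
   length dt alt cls $40;
   input dt $char40.;
   /* Alternation: a lone digit followed by a slash OR a colon */
   alt = prxchange('s/(?<!\d)(\d(?:\/|:))/0$1/', -1, dt);
   /* Character class: the same two choices written as a set */
   cls = prxchange('s/(?<!\d)(\d[\/:])/0$1/', -1, dt);
   /* same is 1 when both patterns give the same result */
   same = (alt = cls);
   put dt= / alt= / cls= / same=;
   datalines;
6/7/2010 12:56:08.56
10/5/2010 5:46
;
```

For single-character choices like these, the character class is usually the shorter way to write it; alternation becomes necessary when the choices are longer than one character.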
Pdogra
Calcite | Level 5
Appreciate your help sensas.

Thanks,
PD
Pdogra
Calcite | Level 5
Hi Sensas,

I played with regex a little bit, but I guess I am nowhere close to your level of expertise.

I am trying 2 things:
1) Look for milliseconds in the time part and remove them, e.g.
12/9/2010 18:06:54.55 (remove .55)
2) Look for seconds and, if not found, append :00 (e.g. 12/9/2010 18:06 to 12/9/2010 18:06:00)

Is it possible to do this in the same line? Sorry, I don't mean to make you do my job. I am sure that by the time I figure this out, you will have a solution for it.

Thanks,

Here's your code:

DATA try;
INPUT dt $CHAR40. ;
length text $40. ;
retain txt;
if _n_ eq 1 then txt= prxparse( 's/(?
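
One possible way to approach both steps, sketched here as an assumption rather than taken from a reply in the thread, is to apply two substitutions in a row. Both patterns are illustrative: the first drops a trailing fractional-seconds part, the second appends :00 when the string ends in H:MM or HH:MM. The \s*$ at the end of each pattern allows for the trailing blanks that pad a SAS character variable.

```sas
data try;
   length dt text $40;
   input dt $char40.;
   /* Step 1: drop a trailing fractional-seconds part such as ".55" */
   text = prxchange('s/\.\d+\s*$//', 1, dt);
   /* Step 2: if the time stops at the minutes, append ":00" */
   text = prxchange('s/(\s\d{1,2}:\d{2})\s*$/$1:00/', 1, text);
   put dt= text=;
   datalines;
12/9/2010 18:06:54.55
12/9/2010 18:06
12/9/2010 18:06:54
;
```

The two calls could be nested into a single statement, but keeping them separate makes each step easier to check against sample data.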
