- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
a data like:
data _tstdata;
length Duration $ 20;
input Duration $;
datalines;
P2Y
P10W
P3M14D
P3D
P6M17DT3H
P14DT7H57M
PT42M18S
PT0.5H
P5DT12.25H
P4.5W
P1Y
P1.4Y
P1.4Y2M
P14Y2.5M
P14Y2.5MT1M
P2MT30M
P2MT30.5M
P2MT30.5M3S
PT6H
PT6.5H
PT6.5H12S
P5W
P5.2W
P5.2W1D
P3Y29DT4H35M59S
P1Y1M1DT2H12M34.23S
P
PT
P3MT
PARALLEL
;
run;
and the last 4 records are not a ISO 8601 duration format.
thanks for helping
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A bit more explanation wouldn't hurt. What's the desired outcome? May-be add some code that creates a Want table and some narrative explaining the logic required to get from Have to Want?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A bit more explanation wouldn't hurt. What's the desired outcome? May-be add some code that creates a Want table and some narrative explaining the logic required to get from Have to Want?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SORRY, i do tried some syntax like:
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Did you try to adapt the marked answer from this SO question?
https://stackoverflow.com/questions/32044846/regex-for-iso-8601-durations
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
yes, the searching found stackflow, but their codes doesn't work at SAS prxmatch functions, when i reference there solution:
^P(?=\d+[YMWD])(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$
as the pattern, i got no an empty column of data_type, i'm not familiar enough with perl regular expression~
- Tags:
- Yes
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@tri_h I've never worked with iso duration formats but just by searching a bit in the SAS docu there are some informat, formats and - best of all - a call routine that deal with such iso duration formats.
I've also implemented the RegEx found under the link @Tom provided. There are some differences between the different options and some further investigation will be required which one returns the "correct" result.
Below sample code should point you into the right direction. I believe ideally you can use the SAS call routine as this will give you the duration as a count of seconds which you then could use directly in conjunction with SAS Datetime values (like: end_dttm = start_dttm+duration_in_sec;).
data have;
infile datalines truncover;
input iso_dur $50.;
datalines;
P2Y
P10W
P3M14D
P3D
P6M17DT3H
P14DT7H57M
PT42M18S
PT0.5H
P5DT12.25H
P4.5W
P1Y
P1.4Y
P1.4Y2M
P14Y2.5M
P14Y2.5MT1M
P2MT30M
P2MT30.5M
P2MT30.5M3S
PT6H
PT6.5H
PT6.5H12S
P5W
P5.2W
P5.2W1D
P3Y29DT4H35M59S
P1Y1M1DT2H12M34.23S
P
PT
P3MT
PARALLEL
;
data demo;
set have;
inf_N8601B=input(strip(iso_dur),$N8601B.);
inf_N8601E=input(strip(iso_dur),$N8601E.);
inf_N8601E_fmt=put(inf_N8601E,$N8601E.);
call is8601_convert('du', 'du', iso_dur, dur_in_sec);
regex_match_flg=prxmatch('/^P(?=\d+[YMWD])(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$/oi',strip(iso_dur));
inf_N8601E_fmt_match_flg= substr(inf_N8601E_fmt,1,1) not in (' ','*');
dur_in_sec_match_flg = not missing(dur_in_sec);
run;
proc print data=demo;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Patrick, Thanks
There are some mistakes in the previous understanding of ISO duration, Like "P1.4Y2M", "P2MT30.5M3S" make no sense, etc.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@tri_h wrote:
Patrick, Thanks
There are some mistakes in the previous understanding of ISO duration, Like "P1.4Y2M", "P2MT30.5M3S" make no sense, etc.
P1.4Y2M got by all options identified as invalid.
P14Y2.5M though got only marked as invalid by the RegEx.
It looks to me like the SAS call routine does a fine job. You will just have to decide how you want to treat a missing - the call routine returns zero seconds and not missing.