BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tri_h
Fluorite | Level 6

a data like:

data _tstdata;
length Duration $ 20;
input Duration $;
datalines;
P2Y
P10W
P3M14D
P3D
P6M17DT3H
P14DT7H57M
PT42M18S
PT0.5H
P5DT12.25H
P4.5W
P1Y
P1.4Y
P1.4Y2M
P14Y2.5M
P14Y2.5MT1M
P2MT30M
P2MT30.5M
P2MT30.5M3S
PT6H
PT6.5H
PT6.5H12S
P5W
P5.2W
P5.2W1D
P3Y29DT4H35M59S
P1Y1M1DT2H12M34.23S

P
PT
P3MT
PARALLEL
;
run;

and the last 4 records are not a ISO 8601 duration format.

thanks for helping

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

A bit more explanation wouldn't hurt. What's the desired outcome? May-be add some code that creates a Want table and some narrative explaining the logic required to get from Have to Want?

View solution in original post

7 REPLIES 7
Patrick
Opal | Level 21

A bit more explanation wouldn't hurt. What's the desired outcome? May-be add some code that creates a Want table and some narrative explaining the logic required to get from Have to Want?

tri_h
Fluorite | Level 6

SORRY, i do tried some syntax like:

data tst;
set _tstdata;
if prxmatch("/^P((\d+\.)?\d+[YMWD]$)?((\d+\.)?\d+Y$)?((\d+\.)?\d+M$)?((\d+\.)?\d+W$)?((\d+\.)?\d+D$)?(T((\d+\.)?\d+[HMS]$)((\d+\.)?\d+H$)?((\d+\.)?\d+M$)?((\d+\.)?\d+S$)?)?/", Duration) then Data_Type='durationDatetime';
 run;
even I add "and substrn(Duration, lengthn(Duration)) not in ('P', 'T')" at the if statement, but i still can't aviod the string of "P*" been recognized as the ISO 8601 DURATION FORMAT(WHICH IS WRONG) , is there any possibility that  i can use only 1 patern to recognized the ISO 8601 duration?
tri_h
Fluorite | Level 6

yes, the searching found stackflow, but their codes doesn't work at SAS prxmatch functions, when i reference there solution:

^P(?=\d+[YMWD])(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$

as the pattern, i got no an empty column of data_type, i'm not familiar enough with perl regular expression~

Patrick
Opal | Level 21

@tri_h I've never worked with iso duration formats but just by searching a bit in the SAS docu there are some informat, formats and - best of all - a call routine that deal with such iso duration formats. 

I've also implemented the RegEx found under the link @Tom provided. There are some differences between the different options and some further investigation will be required which one returns the "correct" result.

Below sample code should point you into the right direction. I believe ideally you can use the SAS call routine as this will give you the duration as a count of seconds which you then could use directly in conjunction with SAS Datetime values (like: end_dttm = start_dttm+duration_in_sec;).

data have;
  infile datalines truncover;
  input iso_dur $50.;
  datalines;
P2Y
P10W
P3M14D
P3D
P6M17DT3H
P14DT7H57M
PT42M18S
PT0.5H
P5DT12.25H
P4.5W
P1Y
P1.4Y
P1.4Y2M
P14Y2.5M
P14Y2.5MT1M
P2MT30M
P2MT30.5M
P2MT30.5M3S
PT6H
PT6.5H
PT6.5H12S
P5W
P5.2W
P5.2W1D
P3Y29DT4H35M59S
P1Y1M1DT2H12M34.23S

P
PT
P3MT
PARALLEL
;

data demo;
  set have;

  inf_N8601B=input(strip(iso_dur),$N8601B.);

  inf_N8601E=input(strip(iso_dur),$N8601E.);
  inf_N8601E_fmt=put(inf_N8601E,$N8601E.);

  call is8601_convert('du', 'du', iso_dur, dur_in_sec);

  regex_match_flg=prxmatch('/^P(?=\d+[YMWD])(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$/oi',strip(iso_dur));
  inf_N8601E_fmt_match_flg= substr(inf_N8601E_fmt,1,1) not in (' ','*');
  dur_in_sec_match_flg = not missing(dur_in_sec);
run;

proc print data=demo;
run;

Patrick_0-1646522510339.png

 

 

 

tri_h
Fluorite | Level 6

Patrick, Thanks

There are some mistakes in the previous understanding of ISO duration, Like "P1.4Y2M", "P2MT30.5M3S" make no sense, etc.

Patrick
Opal | Level 21

@tri_h wrote:

Patrick, Thanks

There are some mistakes in the previous understanding of ISO duration, Like "P1.4Y2M", "P2MT30.5M3S" make no sense, etc.


P1.4Y2M got by all options identified as invalid. 

P14Y2.5M though got only marked as invalid by the RegEx.

 

It looks to me like the SAS call routine does a fine job. You will just have to decide how you want to treat a missing - the call routine returns zero seconds and not missing.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2402 views
  • 3 likes
  • 3 in conversation