kaziumair
Quartz | Level 8

Hi, I want to extract the date from the following string:

"<article id="post-13669" class="post-13669 stories type-stories status-publish has-post-thumbnail hentry category-homepage-stories tag-amab tag-amabhungane tag-dewald-van-rensburg tag-factory tag-idc tag-ikra tag-insa tag-investigative-journalism tag-kzn tag-peter-maskell-auctioneers tag-pots tag-turkey-pots" link="https://amabhungane.org/stories/210312-funder-vs-funder-in-r500m-pots-fiasco/">
"

 

The date is 210312.

 

How can I extract it?

Accepted Solutions
Kurt_Bremser
Super User

Brute force attack:

data have;
  string = '<article id="post-13669" class="post-13669 stories type-stories status-publish has-post-thumbnail hentry category-homepage-stories tag-amab tag-amabhungane tag-dewald-van-rensburg tag-factory tag-idc tag-ikra tag-insa tag-investigative-journalism tag-kzn tag-peter-maskell-auctioneers tag-pots tag-turkey-pots" link="https://amabhungane.org/stories/210312-funder-vs-funder-in-r500m-pots-fiasco/">';
run;

data want;
  set have;
  /* Locate the start of the link attribute */
  pos = index(string,'link="https:');
  /* Skip past 'link="https:/' and take the second-to-last "/"-delimited piece (the URL slug) */
  substring = scan(substr(string,pos+13),-2,"/");
  /* The first six characters of the slug are the date in YYMMDD form */
  date = input(substr(substring,1,6),yymmdd6.);
  format date yymmdd10.;
run;

Someone will come up with a clever application of PRXMATCH, I'm sure.

kaziumair
Quartz | Level 8
Thank you for your help.
andreas_lds
Jade | Level 19

Most likely by using a regular expression. For the text you have posted, the expression

\/(\d{6})\D

seems OK.

See the documentation for PRXMATCH, PRXPARSE and PRXPOSN for details.
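
As a rough sketch of how those three functions could fit together with the expression above: this assumes the data set have and the variable string from the earlier post, and that the six digits are always in YYMMDD order. The variable name rx is made up for illustration.

```sas
data want;
  set have;
  /* Compile the pattern once and keep the id across rows (rx is an illustrative name) */
  retain rx;
  if _n_ = 1 then rx = prxparse('/\/(\d{6})\D/');
  /* PRXMATCH returns 0 when nothing matches, so date stays missing in that case */
  if prxmatch(rx, string) then
    /* Capture buffer 1 holds the six digits; read them as YYMMDD (assumed layout) */
    date = input(prxposn(rx, 1, string), yymmdd6.);
  format date yymmdd10.;
  drop rx;
run;
```

Because the pattern requires a "/" directly before the digits, the 13669 in the id and class attributes is not picked up.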

kaziumair
Quartz | Level 8
Hi, thanks for your suggestion. I will check the docs mentioned.
PaalNavestad
Pyrite | Level 9

Hi, one way would be to use a regular expression to find six digits in a row, or perhaps a "/" followed by six digits; that gives you the date. I would then split the string into year, month and day as numbers and use the MDY function. You may also try the anydtdte. informat.
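
A minimal sketch of the split-and-rebuild idea, assuming the six digits have already been extracted into a character variable (called digits here purely for illustration) and that all dates fall in the years 2000 to 2099:

```sas
data _null_;
  digits = '210312';                        /* assumed: already extracted from the string */
  yy = input(substr(digits, 1, 2), 2.);     /* two-digit year  */
  mm = input(substr(digits, 3, 2), 2.);     /* month           */
  dd = input(substr(digits, 5, 2), 2.);     /* day             */
  /* MDY takes month, day, year in that order; adding 2000 is an assumption about the data */
  date = mdy(mm, dd, 2000 + yy);
  put date= yymmdd10.;
run;
```

MDY returns a missing value when the parts do not form a valid date, which can serve as a simple sanity check on the extraction.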

 

Kurt_Bremser
Super User

Don't use the ANY* informats for strings where the date structure is not very clear. With the given string, you might end up with 2012-03-21 or 2021-03-12, depending on the locale of the SAS session.

Since you can only use the ANY* informats reliably when the structure is clear already, you never use them, unless you want unpredictable results.
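
To see the difference in your own session, something like the following could be run. It is only a sketch: the variable names are illustrative, and what ANYDTDTE returns depends on session options such as DATESTYLE= and LOCALE=, so no particular result is claimed for it.

```sas
data _null_;
  text = '210312';
  /* Explicit layout: always read as year, month, day */
  fixed = input(text, yymmdd6.);
  /* ANYDTDTE has to guess the layout; the guess is influenced by session options */
  guessed = input(text, anydtdte6.);
  put fixed= yymmdd10. guessed= yymmdd10.;
run;
```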

Ksharp
Super User
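data want;
  set have;
  /* Position of the first digit that immediately follows a "/" (0 if there is none) */
  p=prxmatch('/(?<=\/)\d/',string);
  /* Take the six characters starting there; want is a character value, not a SAS date */
  if p then want=substr(string,p,6);
run;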
