Hi, I want to extract the date from the following string:
"<article id="post-13669" class="post-13669 stories type-stories status-publish has-post-thumbnail hentry category-homepage-stories tag-amab tag-amabhungane tag-dewald-van-rensburg tag-factory tag-idc tag-ikra tag-insa tag-investigative-journalism tag-kzn tag-peter-maskell-auctioneers tag-pots tag-turkey-pots" link="https://amabhungane.org/stories/210312-funder-vs-funder-in-r500m-pots-fiasco/">
"
The date is 210312 (the six digits right after "/stories/").
How can I extract it?
Brute force attack:
data have;
  string = '<article id="post-13669" class="post-13669 stories type-stories status-publish has-post-thumbnail hentry category-homepage-stories tag-amab tag-amabhungane tag-dewald-van-rensburg tag-factory tag-idc tag-ikra tag-insa tag-investigative-journalism tag-kzn tag-peter-maskell-auctioneers tag-pots tag-turkey-pots" link="https://amabhungane.org/stories/210312-funder-vs-funder-in-r500m-pots-fiasco/">';
run;
data want;
  set have;
  /* locate the start of the link attribute */
  pos = index(string,'link="https:');
  /* skip past link="https:/ and keep the second-to-last "/"-delimited piece, i.e. the URL slug */
  substring = scan(substr(string,pos+13),-2,"/");
  /* the first six characters of the slug are the date in yymmdd form */
  date = input(substr(substring,1,6),yymmdd6.);
  format date yymmdd10.;
run;
Someone will come up with a clever application of PRXMATCH, I'm sure.
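Most likely by using a regular expression. For the text you have posted, the expression
\/(\d{6})\D
seems OK: it matches a slash, captures the six digits that follow, and requires a non-digit after them.
See the documentation for PRXMATCH, PRXPARSE and PRXPOSN for details.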
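As a rough sketch of how those three functions could fit together with the expression above (assuming the HAVE data set from the earlier reply; the variable name rx is only illustrative):

```sas
data want;
  set have;
  retain rx;
  /* compile the pattern once, on the first row only */
  if _n_ = 1 then rx = prxparse('/\/(\d{6})\D/');
  /* prxmatch returns the match position, or 0 if there is no match */
  if prxmatch(rx, string) then
    /* capture buffer 1 holds the six digits */
    date = input(prxposn(rx, 1, string), yymmdd6.);
  format date yymmdd10.;
  drop rx;
run;
```

The two-digit year is resolved by the session's YEARCUTOFF option, so 21 should come out as 2021 under the usual defaults. If no match is found, date is simply left missing.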
Hi, one way would be to use a regular expression to find six digits in a row, or a / followed by six digits; that gives you the date. You could then split the string into year, month and day as numbers and build the date with the MDY function. You may also try the anydtdte. informat.
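A minimal sketch of the split-and-rebuild idea, assuming the HAVE data set posted earlier in the thread and assuming every date falls in the 2000s (the helper variables p, digits, yy, mm and dd are just illustrative names):

```sas
data want;
  set have;
  length digits $6;
  /* position of the first slash that is followed by six digits */
  p = prxmatch('/\/\d{6}/', string);
  if p then do;
    digits = substr(string, p + 1, 6);
    yy = input(substr(digits, 1, 2), 2.);
    mm = input(substr(digits, 3, 2), 2.);
    dd = input(substr(digits, 5, 2), 2.);
    /* assumption: two-digit years belong to 20xx */
    date = mdy(mm, dd, 2000 + yy);
  end;
  format date yymmdd10.;
  drop p digits yy mm dd;
run;
```

Spelling out the century this way avoids any dependence on session options, at the cost of hard-coding the 2000s.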
Don't use the ANY* informats for strings where the date structure is not very clear. With the given string, you might end up with 2012-03-21 or 2021-03-12, depending on locale of the SAS session.
Since you can only use the ANY* informats reliably when the structure is clear already, you never use them, unless you want unpredictable results.
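data want;
  set have;
  /* position of the first digit that directly follows a slash */
  p=prxmatch('/(?<=\/)\d/',string);
  /* take six characters from there; WANT is a character value, not a SAS date */
  if p then want=substr(string,p,6);
run;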