<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extracting a phrase using Regex (PRXNEXT) in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/779178#M248131</link>
    <description>Thank you ChrisNZ - not sure how I missed this but it was indeed #3.</description>
    <pubDate>Mon, 08 Nov 2021 18:10:57 GMT</pubDate>
    <dc:creator>BrianB4233</dc:creator>
    <dc:date>2021-11-08T18:10:57Z</dc:date>
    <item>
      <title>Extracting a phrase using Regex (PRXNEXT)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/778060#M247648</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I've written code to extract multiple dates, each with up to 18 preceding words/non-words, from a free-text field. I'm using PRXNEXT because there are often multiple dates within the text field and I'd like to extract all of them. However, the SAS output doesn't match what I get when I test the same pattern on&amp;nbsp;&lt;A href="https://regex101.com/" target="_blank" rel="noopener"&gt;https://regex101.com/&lt;/A&gt;. The date itself is identified and output correctly using PRXPOSN, but the extracted text doesn't include all of the words/non-words preceding the date.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What is being output in the temp dataset is this:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;year: 2.9 %..........[average&lt;BR /&gt;woman &amp;lt;1.67%]&lt;BR /&gt;NCI Lifetime: 15.1 %..........[average&lt;BR /&gt;woman &amp;lt;10%]&lt;BR /&gt;&lt;BR /&gt;A&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;Whereas in regex101 it's showing this:&amp;nbsp;&lt;A href="https://regex101.com/r/LWRcqN/1" target="_blank" rel="noopener"&gt;https://regex101.com/r/LWRcqN/1&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data data_chk1;

length dt_1-dt_12 $150 
dt_out1-dt_out12 $30
imp_rep_concat $11000
;

set work.birad_score_0_3(drop=cht_in impressiontext reporttext obs=max);

/* Combine impression &amp;amp; report text together to search as one */
imp_rep_concat = catx(' REPORT_TEXT ',impression_copy,report_copy);

*** Matches numeric dates such as d/m/yy or dd-mm-yyyy (separated by
a period, slash, or hyphen), plus up to 18 preceding words;
if _n_ = 1 then do;
retain dt_pattern;
 dt_pattern = prxparse("/(?:\w+\W+){0,18}(\d{1,2}(\.|\/|-)\d{1,2}(\.|\/|-)\d{2,4})/i");
end;

/*if prxmatch(dt_pattern,impression_copy) then do;*/
/*match = 1;*/
/* date_out = prxposn(dt_pattern,1,impression_copy);*/
/*end;*/

start = 1;
stop = length(imp_rep_concat);

call prxnext(dt_pattern,start,stop,imp_rep_concat,pos,len);
array comm[12] $ dt_1-dt_12;
array comm1[12] $ dt_out1-dt_out12;
do i = 1 to 12 while (pos &amp;gt; 0);
	comm[i] = substr(imp_rep_concat,pos,len);
	comm1[i] = prxposn(dt_pattern, 1, imp_rep_concat);
	call prxnext(dt_pattern,start,stop,imp_rep_concat,pos,len);
end;


*drop dt_1-dt_12 dt_pattern: start: stop: pos len i;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Any ideas what is causing the inconsistency? Thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 02 Nov 2021 23:01:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/778060#M247648</guid>
      <dc:creator>BrianB4233</dc:creator>
      <dc:date>2021-11-02T23:01:32Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a phrase using Regex (PRXNEXT)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/778096#M247659</link>
      <description>&lt;P&gt;1. The link you sent uses the flags gmi. SAS does not support the g flag, and your code only uses i.&lt;/P&gt;
&lt;P&gt;2. In any case, using only i gives the same result on regex101.com.&lt;/P&gt;
&lt;P&gt;3. The result is the same in SAS and regex101, except that the SAS result is truncated at length 200. Try lengthening the variable.&amp;nbsp;&lt;/P&gt;
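&lt;P&gt;A minimal sketch of point 3, assuming made-up sample text and hypothetical variable names (short_ctx, long_ctx, date_out) that are not from the original program: the same match is assigned to a short and a long character variable, and only the short one should lose text.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=""&gt;data _null_;
  /* Hypothetical lengths chosen for illustration only */
  length short_ctx $20 long_ctx $200 date_out $30;

  /* Same pattern as in the question: up to 18 words, then a numeric date */
  re = prxparse('/(?:\w+\W+){0,18}(\d{1,2}(\.|\/|-)\d{1,2}(\.|\/|-)\d{2,4})/i');

  /* Made-up sample text containing a single date */
  text = 'Screening mammogram compared with the prior outside study dated 10/12/2019.';

  start = 1;
  stop = length(text);

  /* Repeated PRXNEXT calls stand in for the g flag used on regex101 */
  call prxnext(re, start, stop, text, pos, len);
  do while (pos &amp;gt; 0);
    short_ctx = substr(text, pos, len); /* cut off at 20 characters when the match is longer */
    long_ctx  = substr(text, pos, len); /* long enough to hold this match in full */
    date_out  = prxposn(re, 1, text);   /* capture group 1: the date itself */
    put len= / short_ctx= / long_ctx= / date_out=;
    call prxnext(re, start, stop, text, pos, len);
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In the original program, the equivalent change would presumably be a larger length for dt_1-dt_12 in the LENGTH statement, sized to the longest match you expect.&lt;/P&gt;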
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 03 Nov 2021 07:39:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/778096#M247659</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-11-03T07:39:58Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a phrase using Regex (PRXNEXT)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/779178#M248131</link>
      <description>Thank you ChrisNZ - not sure how I missed this but it was indeed #3.</description>
      <pubDate>Mon, 08 Nov 2021 18:10:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-a-phrase-using-Regex-PRXNEXT/m-p/779178#M248131</guid>
      <dc:creator>BrianB4233</dc:creator>
      <dc:date>2021-11-08T18:10:57Z</dc:date>
    </item>
  </channel>
</rss>

