Extracting Substrings Via PRX Functions

Hi Guys,

I have a string where the value I wish to extract varies from observation to observation.  I have created the following code as an example:

DATA STUFF;

STRING = '"><ahref"/browse/601/0/3"><imgsrc="/static/img/prev.gif"border="0"alt="Previous"/></a> <ahref="/browse/601/0/3">1</a> 2 <ahref="/browse/601/2/3">3</a> <ahref="/browse/601/3/3">4</a> <ahref="/browse/601/4/3">5</a> <ahref="/browse/601/5/3">6</a> <ahref="/browse/601/6/3">7</a> <ahref="/browse/601/7/3">8</a> <ahref="/browse/601/8/3">9</a> <ahref="/browse/601/9/3">10</a> <ahref="/browse/601/10/3">11</a> <ahref="/browse/601/11/3">12</a> <ahref="/browse/601/12/3">13</a> <ahref="/browse/601/13/3">14</a> <ahref="/browse/601/14/3">15</a> <ahref="/browse/601/15/3">16</a> <ahref="/browse/601/16/3">17</a> <ahref="/browse/601/17/3">18</a> <ahref="/browse/601/18/3">19</a> <ahref="/browse/601/19/3">20</a> <ahref="/browse/601/20/3">21</a> <ahref="/browse/601/21/3">22</a> <ahref="/browse/601/22/3">23</a> <ahref="/browse/601/23/3">24</a> <ahref="/browse/601/24/3">25</a> <ahref="/browse/601/25/3">26</a> <ahref="/browse/601/26/3">27</a> <ahref="/browse/601/27/3">28</a> <ahref="/browse/601/28/3">29</a> <ahref="/browse/601/29/3">30</a> <ahref="/browse/601/2/3"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> ';

PATTERN = PRXPARSE('#"/browse/\d+\/\d+/\d+"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> #');

CALL PRXSUBSTR(PATTERN,STRING,START,LENGTH);

SUB = SUBSTR(STRING,START,LENGTH);

RUN;

The result is SUB = "/browse/601/2/3"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> . However, I really want to return only the "/browse/601/2/3" portion. There are obviously numerous ways to do this with non-PRX functions, but I was hoping for a PRX method that does not require a separate PRX pattern to match this substring, as in the code below.

DATA STUFF;

LENGTH SUB $80 SUB1 $15;

STRING = '"><ahref"/browse/601/0/3"><imgsrc="/static/img/prev.gif"border="0"alt="Previous"/></a> <ahref="/browse/601/0/3">1</a> 2 <ahref="/browse/601/2/3">3</a> <ahref="/browse/601/3/3">4</a> <ahref="/browse/601/4/3">5</a> <ahref="/browse/601/5/3">6</a> <ahref="/browse/601/6/3">7</a> <ahref="/browse/601/7/3">8</a> <ahref="/browse/601/8/3">9</a> <ahref="/browse/601/9/3">10</a> <ahref="/browse/601/10/3">11</a> <ahref="/browse/601/11/3">12</a> <ahref="/browse/601/12/3">13</a> <ahref="/browse/601/13/3">14</a> <ahref="/browse/601/14/3">15</a> <ahref="/browse/601/15/3">16</a> <ahref="/browse/601/16/3">17</a> <ahref="/browse/601/17/3">18</a> <ahref="/browse/601/18/3">19</a> <ahref="/browse/601/19/3">20</a> <ahref="/browse/601/20/3">21</a> <ahref="/browse/601/21/3">22</a> <ahref="/browse/601/22/3">23</a> <ahref="/browse/601/23/3">24</a> <ahref="/browse/601/24/3">25</a> <ahref="/browse/601/25/3">26</a> <ahref="/browse/601/26/3">27</a> <ahref="/browse/601/27/3">28</a> <ahref="/browse/601/28/3">29</a> <ahref="/browse/601/29/3">30</a> <ahref="/browse/601/2/3"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> ';

PATTERN = PRXPARSE('#"/browse/\d+\/\d+/\d+"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> #');

PATTERN1 = PRXPARSE('#/browse/\d+\/\d+/\d+#');

CALL PRXSUBSTR(PATTERN,STRING,START,LENGTH);

SUB = SUBSTR(STRING,START,LENGTH);

CALL PRXSUBSTR(PATTERN1,SUB,START1,LENGTH1);

SUB1 = SUBSTR(SUB,START1,LENGTH1);

RUN;

Thank you very much for your help.

Regards,

Scott


Accepted Solutions
Solution
‎09-18-2013 02:20 AM
Trusted Advisor
Posts: 1,301

Re: Extracting Substrings Via PRX Functions

Posted in reply to Scott_Mitchell

It is better to exploit the capture buffers available through PRX than to use SUBSTR. Try something like:

string='"><ahref"/browse/601/0/3"><imgsrc="/static/img/prev.gif"border="0"alt="Previous"/></a> <ahref="/browse/601/0/3">1</a> 2 <ahref="/browse/601/2/3">3</a> <ahref="/browse/601/3/3">4</a> <ahref="/browse/601/4/3">5</a> <ahref="/browse/601/5/3">6</a> <ahref="/browse/601/6/3">7</a> <ahref="/browse/601/7/3">8</a> <ahref="/browse/601/8/3">9</a> <ahref="/browse/601/9/3">10</a> <ahref="/browse/601/10/3">11</a> <ahref="/browse/601/11/3">12</a> <ahref="/browse/601/12/3">13</a> <ahref="/browse/601/13/3">14</a> <ahref="/browse/601/14/3">15</a> <ahref="/browse/601/15/3">16</a> <ahref="/browse/601/16/3">17</a> <ahref="/browse/601/17/3">18</a> <ahref="/browse/601/18/3">19</a> <ahref="/browse/601/19/3">20</a> <ahref="/browse/601/20/3">21</a> <ahref="/browse/601/21/3">22</a> <ahref="/browse/601/22/3">23</a> <ahref="/browse/601/23/3">24</a> <ahref="/browse/601/24/3">25</a> <ahref="/browse/601/25/3">26</a> <ahref="/browse/601/26/3">27</a> <ahref="/browse/601/27/3">28</a> <ahref="/browse/601/28/3">29</a> <ahref="/browse/601/29/3">30</a> <ahref="/browse/601/2/3"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> ';

prxid=prxparse('#(/browse/\d+/\d+/\d+)"><imgsrc="/static/img/next\.gif"#');

if prxmatch(prxid,string) then x=prxposn(prxid,1,string);
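
For anyone who wants to run the capture-buffer approach end to end, here is a minimal sketch as a complete DATA step. The data set name WANT, the variable lengths and the shortened sample string are assumptions made for illustration. The pattern is anchored on next.gif because the Previous link in the sample string is also followed by "><imgsrc.

```sas
data want;
  length string $300 x $15;
  /* Shortened, hypothetical version of the sample string */
  string = '<ahref="/browse/601/0/3"><imgsrc="/static/img/prev.gif"border="0"alt="Previous"/></a> <ahref="/browse/601/2/3">3</a> <ahref="/browse/601/2/3"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> ';
  /* Compile the pattern once and keep the id across observations */
  retain prxid;
  if _n_ = 1 then prxid = prxparse('#(/browse/\d+/\d+/\d+)"><imgsrc="/static/img/next\.gif"#');
  /* PRXMATCH returns the match position, or 0 when there is no match */
  if prxmatch(prxid, string) then x = prxposn(prxid, 1, string);
  put x=;
run;
```

PRXPOSN returns the text held in capture buffer 1, so X should contain only the path inside the parentheses and no SUBSTR call is needed.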


All Replies

Super Contributor
Posts: 297

Re: Extracting Substrings Via PRX Functions

Thanks guys, that was perfect.

Respected Advisor
Posts: 4,173

Re: Extracting Substrings Via PRX Functions

Posted in reply to Scott_Mitchell

You could also try using lookahead and lookbehind assertions as part of your regex.
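
A rough sketch of that idea, assuming a shortened version of the sample string and hypothetical names (LOOKAROUND, LEN): the lookbehind (?<=...) and lookahead (?=...) check the surrounding text without including it in the match, so CALL PRXSUBSTR reports the position and length of the path alone. The lookbehind has to be fixed width, which <ahref=" is.

```sas
data lookaround;
  length string $300 sub $15;
  /* Shortened, hypothetical version of the sample string */
  string = '<ahref="/browse/601/0/3"><imgsrc="/static/img/prev.gif"border="0"alt="Previous"/></a> <ahref="/browse/601/2/3">3</a> <ahref="/browse/601/2/3"><imgsrc="/static/img/next.gif"border="0"alt="Next"/></a> ';
  /* Match the path only when it follows <ahref=" and precedes the Next image tag */
  pattern = prxparse('#(?<=<ahref=")/browse/\d+/\d+/\d+(?="><imgsrc="/static/img/next\.gif")#');
  call prxsubstr(pattern, string, start, len);
  /* START is 0 when nothing matches, so guard the SUBSTR call */
  if start > 0 then sub = substr(string, start, len);
  put sub=;
run;
```

With this pattern a single PRXPARSE call is enough, and SUB should hold only the /browse/601/2/3 portion.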
