Help using Base SAS procedures

Index/Find function not working for parsing specific text

Contributor spg
Posts: 61

Hi all,

 

When I use the INDEX function to search the following texts for "BAN", both records get a hit, whereas ideally only the first should.

Also, is there a way to specify the code to look for the word BAN after two underscores?

 

I'm using:

a=index(TYPE,"BAN");

 

TYPE

abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde

xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz

 

Thanks.

 

 

Respected Advisor
Posts: 4,925

Re: Index/Find function not working for parsing specific text

a = index(TYPE, "_BAN_");
if a > 0 then a = a + 1;   /* point at the B of BAN; a stays 0 when there is no match */

 

A much greater flexibility is provided with regular expression pattern matching (e.g. the PRXMATCH function).
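As a rough sketch of the PRXMATCH idea, the step below flags records in which BAN appears as a complete underscore-delimited token. The pattern, the variable name hit, and the third test record (containing BANK) are illustrative assumptions, not something taken from the original data.

```sas
data _null_;
   input type $60.;
   /* BAN must be preceded by an underscore (or the start of the string) */
   /* and followed by an underscore (or only trailing blanks).           */
   hit = prxmatch('/(^|_)BAN(_|\s*$)/', type) > 0;
   put type= hit=;
   datalines;
abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
abc_de_BANK_xyz
;
run;
```

With this pattern only the first record should get hit=1; RMBAN and BANK do not match because the characters around BAN are not underscores. Note that PRXMATCH returns the position where the whole match starts (here the leading underscore), so the sketch uses it only as a yes/no flag.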

PG
Trusted Advisor
Posts: 1,117

Re: Index/Find function not working for parsing specific text

Hi @spg,

 

Both INDEX and FIND search for substrings, regardless of whether they are part of a word. In contrast, the FINDW function could help you to avoid those unwanted hits:

a=findw(type,'BAN',,'ps');

(The modifiers p and s add punctuation marks [incl. the underscore] and space characters to the list of default delimiters.)
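To see the difference on the two records from the question, a small test step along these lines could be used. The data set name check and the variable names a_index and a_findw are made up for illustration, and the delimiter list is narrowed to the underscore alone, which is an assumption about what suits this data.

```sas
data check;
   input type $60.;
   a_index = index(type, 'BAN');        /* plain substring search            */
   a_findw = findw(type, 'BAN', '_');   /* whole word, underscore delimiter  */
   datalines;
abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
;
run;

proc print data=check;
run;
```

INDEX returns 10 for both records, because the substring BAN happens to start at position 10 in each. FINDW returns 10 for the first record and 0 for the second, since in RMBAN the letters BAN are not a complete delimited word.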

 

Searching for "__BAN" with the INDEX function would, of course, also find this substring in "__BANK". Again, you could try to apply FINDW, but you would have to choose the function arguments carefully. As PG suggested, the final resort might be the PRXMATCH function.

Respected Advisor
Posts: 3,799

Re: Index/Find function not working for parsing specific text

Search for the starting point then FIND from there.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=find(string,"BAN",s);
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;


Respected Advisor
Posts: 3,799

Re: Index/Find function not working for parsing specific text

Posted in reply to data_null__

This is a version that uses FINDW to find BAN delimited by UNDERSCORE.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=findw(string,"BAN",s,'_');
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;

Contributor spg
Posts: 61

Re: Index/Find function not working for parsing specific text

Posted in reply to data_null__

Thank you all for the solutions provided! Much appreciated...

Adding the _ to the INDEX function worked pretty well for my dataset. I have yet to try the code provided by data_null__.
