BookmarkSubscribeRSS Feed
spg
Obsidian | Level 7 spg
Obsidian | Level 7

Hi all,

 

When i use the index function to parse the following texts to yield "BAN", both the records are getting a hit, whereas it should ideally be only the first. 

Also, is there a way to specify the code to look for the word BAN after two underscores?

 

I'm using:

a=index(TYPE,"BAN");

 

TYPE

abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde

xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz

 

Thanks.

 

 

5 REPLIES 5
PGStats
Opal | Level 21

a = index(TYPE, "_BAN_") + 1;

 

A much greater flexibility is provided with regular expression pattern matching (e.g. the PRXMATCH function).

PG
FreelanceReinh
Jade | Level 19

Hi @spg,

 

Both INDEX and FIND search for substrings, regardless whether they are part of a word. In contrast, the FINDW function could help you to avoid those unwanted hits:

a=findw(type,'BAN',,'ps');

(The modifiers p and s add punctuation marks [incl. the underscore] and space characters to the list of default delimiters.)

 

Searching for "__BAN" with the INDEX function would, of course, also find this substring in "__BANK". Again, you could try to apply FINDW, but you would have to choose the function arguments carefully. As PG suggested, the final resort might be the PRXMATCH function.

data_null__
Jade | Level 19

Search for the starting point then FIND from there.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=find(string,"BAN",s);
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;

Capture.PNG

data_null__
Jade | Level 19

This is a version that uses FINDW to find BAN delimited by UNDERSCORE.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=findw(string,"BAN",s,'_');
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;
spg
Obsidian | Level 7 spg
Obsidian | Level 7

Thank you all for the solutions provided! Much appreciated...

Adding the _ to the index function worked pretty well for my dataset. I am yet to try the code provided ny data_null_ 🙂

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 3317 views
  • 4 likes
  • 4 in conversation