spg
Obsidian | Level 7

Hi all,

 

When I use the INDEX function to search the following text values for "BAN", both records get a hit, whereas ideally only the first should.

Also, is there a way to make the code look for the word BAN only after two underscores?

 

I'm using:

a=index(TYPE,"BAN");

 

TYPE

abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde

xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz

 

Thanks.

 

 

5 REPLIES
PGStats
Opal | Level 21

a = index(TYPE, "_BAN_");
if a > 0 then a = a + 1;  /* position of the "B"; a stays 0 when "_BAN_" is not found */

 

Regular expression pattern matching (e.g. the PRXMATCH function) provides much greater flexibility.
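
As a minimal sketch of the PRXMATCH approach (the data set name `want` and the exact pattern are assumptions for illustration, not part of the original question), the following counts BAN as a hit only when it is bounded by underscores or by the start or end of the value:

```sas
data want;
   input TYPE :$80.;
   /* (^|_) : BAN must start the value or follow an underscore   */
   /* (_|$) : BAN must be followed by an underscore or end there */
   /* TRIM removes trailing blanks so that $ can match at the end */
   a = prxmatch('/(^|_)BAN(_|$)/', trim(TYPE));
   datalines;
abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
;
run;

proc print data=want;
run;
```

With this pattern a nonzero `a` is the position where the match starts, which is the leading underscore when there is one, so it is best read as a hit/no-hit flag. The second record should give 0 because BAN is preceded by "M" there.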

PG
FreelanceReinh
Jade | Level 19

Hi @spg,

 

Both INDEX and FIND search for substrings, regardless of whether they are part of a longer word. In contrast, the FINDW function can help you avoid those unwanted hits:

a=findw(type,'BAN',,'ps');

(The modifiers p and s add punctuation marks [incl. the underscore] and space characters to the list of default delimiters.)
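
If the underscore is the only separator in these values, another option is to pass it explicitly as the delimiter list. A small sketch, assuming the two sample values from the question and a hypothetical data set name `check`:

```sas
data check;
   input TYPE :$80.;
   /* '_' is the only word delimiter here, so BAN must be */
   /* a complete underscore-separated token to be found   */
   a = findw(TYPE, 'BAN', '_');
   datalines;
abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
;
run;

proc print data=check;
run;
```

Here `a` should be positive for the first record only. In the second record BAN is part of the token RMBAN, so FINDW returns 0.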

 

Searching for "__BAN" with the INDEX function would, of course, also find this substring in "__BANK". Again, you could try FINDW, but you would have to choose the function arguments carefully. As PG suggested, the last resort might be the PRXMATCH function.
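
To illustrate the "__BANK" caveat, here is a hedged sketch with PRXMATCH. The three test values are made up for the example, and treating "any letter or digit after BAN" as a non-match is an assumption about what counts as a word:

```sas
data _null_;
   length TYPE $40;
   /* hypothetical test values, not taken from the original data */
   do TYPE = 'ab__BAN_cd', 'ab__BANK_cd', 'ab_BAN_cd';
      /* "__BAN" must be followed by a character that is not a letter */
      /* or digit, or by the end of the (trimmed) value               */
      a = prxmatch('/__BAN([^A-Za-z0-9]|$)/', trim(TYPE));
      put TYPE= a=;
   end;
run;
```

Only the first value should yield a nonzero `a`: the second has "K" directly after BAN, and the third has only a single underscore before it.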

data_null__
Jade | Level 19

Search for the starting point then FIND from there.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=find(string,"BAN",s);
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;


data_null__
Jade | Level 19

This is a version that uses FINDW to find BAN delimited by UNDERSCORE.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=findw(string,"BAN",s,'_');
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;
spg
Obsidian | Level 7

Thank you all for the solutions provided! Much appreciated...

Adding the underscores to the INDEX search string worked pretty well for my dataset. I have yet to try the code provided by data_null__ 🙂

