spg
Obsidian | Level 7

Hi all,

 

When I use the INDEX function to search the following values for "BAN", both records get a hit, whereas ideally only the first one should.

Also, is there a way to make the code look for the word BAN only after two underscores?

 

I'm using:

a=index(TYPE,"BAN");

 

TYPE

abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde

xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz

 

Thanks.

 

 

5 REPLIES
PGStats
Opal | Level 21

a = index(TYPE, "_BAN_");
if a > 0 then a = a + 1; /* point at the B; a stays 0 when "_BAN_" is not found */

 

A much greater flexibility is provided with regular expression pattern matching (e.g. the PRXMATCH function).
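As a rough sketch of the regular expression route, the step below looks for BAN preceded by an underscore and followed by either an underscore or the end of the value. The data set name `ban_prx`, the variable `rx`, and the pattern itself are illustrative assumptions, not something taken from the original question.

```sas
data ban_prx;
   input TYPE $80.;
   retain rx;
   /* Compile the pattern once: underscore, BAN (captured), then underscore or end of value */
   if _n_ = 1 then rx = prxparse('/_(BAN)(_|\s*$)/');
   a = 0;                                              /* stays 0 when there is no match */
   if prxmatch(rx, TYPE) then call prxposn(rx, 1, a);  /* start position of capture group 1 */
   drop rx;
   cards;
abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
;
run;

proc print data=ban_prx;
run;
```

With the two sample values, a should be 10 for the first record and 0 for the second. If BAN must be exactly the third underscore-delimited piece, a stricter pattern such as '/^[^_]*_[^_]*_(BAN)(_|\s*$)/' might be a starting point.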

PG
FreelanceReinh
Jade | Level 19

Hi @spg,

 

Both INDEX and FIND search for substrings, regardless of whether they are part of a longer word. In contrast, the FINDW function searches for whole words and could help you avoid those unwanted hits:

a=findw(type,'BAN',,'ps');

(The modifiers p and s add punctuation marks [incl. the underscore] and space characters to the list of default delimiters.)
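To see the difference side by side, here is a small sketch that runs both functions on the two values from the question. The data set name `findw_demo` and the variable names `a_index` and `a_findw` are made up for illustration.

```sas
data findw_demo;
   input TYPE $80.;
   a_index = index(TYPE, 'BAN');          /* substring search: finds BAN inside RMBAN too */
   a_findw = findw(TYPE, 'BAN', , 'ps');  /* word search: BAN must sit between delimiters */
   cards;
abcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
;
run;

proc print data=findw_demo;
run;
```

INDEX should report a nonzero position for both records, while FINDW should return 0 for the second one because RMBAN is a different word.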

 

Searching for "__BAN" with the INDEX function would, of course, also find this substring in "__BANK". Again, you could try to apply FINDW, but you would have to choose the function arguments carefully. As PG suggested, the final resort might be the PRXMATCH function.
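The caveat can be illustrated with a made-up value containing BANK (hypothetical, not from the question). For brevity this sketch uses a single leading underscore in the INDEX search string and passes an explicit delimiter list (underscore and blank) to FINDW.

```sas
data _null_;
   length TYPE $40;
   TYPE = 'abc_DT_BANK_728x90';          /* hypothetical value */
   a_index = index(TYPE, '_BAN');        /* substring search also matches inside _BANK */
   a_findw = findw(TYPE, 'BAN', '_ ');   /* BANK is not the word BAN, so no hit */
   put a_index= a_findw=;
run;
```

Here INDEX returns the position of the underscore before BANK, whereas FINDW returns 0.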

data_null__
Jade | Level 19

Search for the starting point then FIND from there.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=find(string,"BAN",s);
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;


data_null__
Jade | Level 19

This is a version that uses FINDW to find BAN delimited by UNDERSCORE.

data ban;
   input string $80.;
   /*find position of second underscore*/
   s = 0;
   do _n_ = 1 to 2;
      s = findc(string,'_',s+1); 
      end;
   /*Search for BAN*/
   a=findw(string,"BAN",s,'_');
   cards;
abBANcde_DT_BAN_OO_3MORE_LIF_CPM_728x90_abcde
xyz_ab_RMBAN_PR_BHV_NOGEN_DCPM_160x600_uvwxyz
one_underscore BAN
No underscore BAN
;;;;
   run;
proc print;
   run;
spg
Obsidian | Level 7

Thank you all for the solutions provided! Much appreciated...

Adding the _ to the INDEX function worked pretty well for my dataset. I have yet to try the code provided by data_null__ 🙂
