Determine Starting Position of multiple substrings in string

Hello everyone. I have a unique request that I am struggling to complete. Basically, I have a string similar to the one below.

mmm death mmm death mmm death oijoijasdfoji death.

What I need is a new variable that is itself a string of the starting positions of the word "death" in this string (either by character position or by word position within the string).

Ex:

I need a variable that has the following from this string.

String position of word "death".

4,14,24,44

Word position of word "death".

2,4,6,8

So my final dataset would look as following.

data answer;
   stringvar = 'mmm death mmm death mmm death oijoijasdfoji death';
   positionanswer = '4,14,24,44';
   wordpositionanswer = '2,4,6,8';
run;

Clearly this is hard-coded, but the columns "positionanswer" and "wordpositionanswer" are the results I am looking for!

Thanks

Brandon


Accepted Solutions
Solution
‎04-02-2014 04:59 PM
Respected Advisor
Posts: 4,925

Re: Determine Starting Position of multiple substrings in string

Posted in reply to Anotherdream

Use CALL SCAN:

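data test;
   line = "  mmm death mmm death mmm death oijoijasdfoji death  ";
   length strPos wrdPos $200;
   /* Step through the words of line until CALL SCAN runs out of words */
   do i = 1 by 1;
      call scan(line, i, pos, len);   /* pos, len = start and length of word i */
      if pos = 0 then leave;          /* pos is 0 when there is no i-th word */
      if substr(line, pos, len) = "death" then do;
         strPos = catx(",", strPos, pos);   /* character position */
         wrdPos = catx(",", wrdPos, i);     /* word position */
      end;
   end;
   drop i pos len;
run;

proc print data=test noobs; run;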
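
Note that the test string above has two leading blanks, so strPos comes out as 7,17,27,47. For the string exactly as posted in the question, SAS's one-based positions are 5,15,25,45. The sketch below assumes the 4,14,24,44 in the question were meant as zero-based offsets and subtracts 1 to reproduce them; that reading is a guess, and the variable names are simply the ones from the question.

```sas
data answer;
   length stringvar $60 positionanswer wordpositionanswer $200;
   stringvar = 'mmm death mmm death mmm death oijoijasdfoji death';
   do i = 1 by 1;
      call scan(stringvar, i, pos, len);
      if pos = 0 then leave;                     /* no more words */
      if substr(stringvar, pos, len) = 'death' then do;
         /* pos is one-based; pos - 1 is the assumed zero-based offset */
         positionanswer     = catx(',', positionanswer, pos - 1);
         wordpositionanswer = catx(',', wordpositionanswer, i);
      end;
   end;
   drop i pos len;
run;
/* positionanswer = 4,14,24,44   wordpositionanswer = 2,4,6,8 */
```

Drop the "- 1" if the usual one-based positions are wanted. The comparison here is case-sensitive; comparing lowcase(substr(stringvar, pos, len)) instead would also match "Death" or "DEATH".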

PG



All Replies
Super Contributor
Posts: 418

Re: Determine Starting Position of multiple substrings in string

Oh my, I was not aware of the CALL SCAN routine. That is the DEFINITION of what I was looking for!

Thank you so much!

Brandon

Respected Advisor
Posts: 3,799

Re: Determine Starting Position of multiple substrings in string

Posted in reply to Anotherdream

Alternatively, use FINDW.

data findW;
   line = "  mmm death mmm death mmm death oijoijasdfoji death  ";
   word = 'DEATH';
   length strPos wrdPos $200;
   c = findw(line,word,' ','I',1);
   w = findw(line,word,' ','IE',1);
   do while(c gt 0);
      strPos = catx(',',strPos,c);
      wrdPos = catx(',',wrdPos,w);
      w = w+findw(line,word,' ','IE',c+1);
      c = findw(line,word,' ','I',c+1);
   end;
   drop c w;
run;