DATA Step, Macro, Functions and more

Count presence of two-character string among many characters

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 95
Accepted Solution

Count presence of two-character string among many characters

hi,

 

I need to count times string "RE" appears in multicharacter string. String is always char.

 

Example:

Data_Want     Column_have_id     Column_have_string

 

2                               01                   NDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETDRECA

1                               02                   ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD

1                               03                   ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD

0                               04                   ENDCSPCRCBCSLAFDWIWOFDSP

 

In the first string of data in the above example, "RE" is read twice. ("RE" should be captured as is, not "ER"...).

 

thank you


Accepted Solutions
Solution
‎03-07-2017 08:34 AM
Super User
Super User
Posts: 7,970

Re: Count presence of two-character string among many characters

Just loop over the characters 1 by 1, and if char + 1 = RE increment count:

data have;
  length s $2000;
  input s $;
datalines; 
NDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETDRECA
ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD
;
run;

data want (drop=i);
  set have;
  count=0;
  do i=1 to lengthn(s)-1;
    if cats(char(s,i),char(s,i+1))="RE" then count=sum(count,1);
  end;
run;

View solution in original post


All Replies
Solution
‎03-07-2017 08:34 AM
Super User
Super User
Posts: 7,970

Re: Count presence of two-character string among many characters

Just loop over the characters 1 by 1, and if char + 1 = RE increment count:

data have;
  length s $2000;
  input s $;
datalines; 
NDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETDRECA
ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD
;
run;

data want (drop=i);
  set have;
  count=0;
  do i=1 to lengthn(s)-1;
    if cats(char(s,i),char(s,i+1))="RE" then count=sum(count,1);
  end;
run;
Frequent Contributor
Posts: 95

Re: Count presence of two-character string among many characters

this is very helpful, thanks again
Respected Advisor
Posts: 3,799

Re: Count presence of two-character string among many characters


brulard wrote:

hi,

 

I need to count times string "RE" appears in multicharacter string. String is always char.

 

Example:

Data_Want     Column_have_id     Column_have_string

 

2                               01                   NDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETDRECA

1                               02                   ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD

1                               03                   ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD

0                               04                   ENDCSPCRCBCSLAFDWIWOFDSP

 

In the first string of data in the above example, "RE" is read twice. ("RE" should be captured as is, not "ER"...).

 

thank you


The FIND function is a good choice for this problem.

 

data re;
   input s $80.;
   count=0;
   do x=1 by 0 until(f eq 0);
      f=find(s,'RE',x);
      if f then count+1;
      x=f+2;
      end;
   drop x f;
   cards;
NDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETDRECA
ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD
ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD
ENDCSPCRCBCSLAFDWIWOFDSP
;;;;
   run;
proc print;
   run;

Capture.PNG

Super User
Posts: 10,035

Re: Count presence of two-character string among many characters

Can you use COUNT()?


data have;
  length s $2000;
  input s $;
datalines; 
NDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETDRECA
ENDCSPCRCBCSLAFDWIWOFDSPDERRDETDRECADETD
;
run;

data want (drop=i);
  set have;
  count=0;
  do i=1 to lengthn(s)-1;
    if cats(char(s,i),char(s,i+1))="RE" then count=sum(count,1);
  end;
  
  c=count(s,'RE');
  
run;
proc print;run;


Frequent Contributor
Posts: 95

Re: Count presence of two-character string among many characters

that works too. Thanks Ksharp

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 148 views
  • 2 likes
  • 4 in conversation