## How to identify repeating characters in the string.

Solved
Occasional Contributor
Posts: 15

Hello, I have a 9-character data field and I need to identify whether any 3 consecutive characters in that string are the same. Any ideas on how to achieve this?

Thanks.

Pramodini

Accepted Solutions
Solution
‎03-26-2015 10:46 AM
Super User
Posts: 6,785

## Re: How to identify repeating characters in the string.

Assuming you want an exact match on any possible character, here is a way:

data want;

set have;

do i=1 to 7 until (flag=1);

if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;

end;

run;

Good luck.
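To try the loop above end to end, here is a minimal sketch. The sample values and the `length var $9` declaration are assumptions made up for illustration; only the loop itself comes from the post. It also sets `flag` to 0 up front, so rows without a run of three end up with 0 rather than a missing value, and it spells the chained comparison out with an explicit `and`.

```sas
/* Hypothetical sample data: a 9-character field named VAR */
data have;
    length var $9;
    input var $char9.;
    datalines;
ABC111XYZ
AABBCCDDE
XXXYYYZZZ
A  B  CCC
;
run;

data want;
    set have;
    flag = 0;                        /* default: no run of three found */
    do i = 1 to 7 until (flag = 1);  /* last possible start is 9 - 2 = 7 */
        if substr(var, i, 1) = substr(var, i+1, 1)
           and substr(var, i+1, 1) = substr(var, i+2, 1) then flag = 1;
    end;
    drop i;
run;

proc print data=want;
run;
```

With these four rows, the second row (`AABBCCDDE`) has only pairs and should keep `flag = 0`; the other three each contain a run of three identical characters.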

All Replies
Posts: 1,147

## Re: How to identify repeating characters in the string.

Could you please give some example data?

Thanks,
Jag
Super User
Posts: 13,583

## Re: How to identify repeating characters in the string.

Does case matter? For example, does "aAa" count as the "same character"? Are any of the characters special characters such as punctuation: ()!@#$%^&*-_=+/><|][{}?

Do multiple spaces count?

Contributor
Posts: 24

## Re: How to identify repeating characters in the string.

Hey. I was getting the correct result, but I'm getting a warning too. I modified the code a bit and tried again, but I can't figure it out.

Could you tell me why I am getting the warning below and how I can fix it?

Thanks.

data have;

var="Sasexampleee";output;

var="Sasexampplee";output;

var="Sasexammmppl";output;

var="Sasexampl";output;

run;

data want;

set have;

leng=length(var);

do i=1 to leng;

if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;

end;

run;

NOTE: Invalid second argument to function SUBSTR at line 27 column 52.

NOTE: Invalid second argument to function SUBSTR at line 27 column 30.

NOTE: Invalid second argument to function SUBSTR at line 27 column 52.

var=Sasexampplee leng=12 i=13 flag=. _ERROR_=1 _N_=2

NOTE: Invalid second argument to function SUBSTR at line 27 column 52.

var=Sasexamplee leng=11 i=12 flag=. _ERROR_=1 _N_=4

NOTE: There were 4 observations read from the data set WORK.HAVE.

Posts: 5,543

## Re: How to identify repeating characters in the string.

When i = leng in your loop, you look at character position i+2 which is beyond the length of the variable. You can simply stop the loop sooner:

data want;

set have;

leng=length(var) - 2;

do i=1 to leng;

if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;

end;

run;

PG
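A self-contained sketch of the same fix, reusing the four sample values posted earlier in the thread. The names `last_start` and `c` are introduced here for illustration only, and the `while (flag = 0)` clause is an addition that stops scanning once a run has been found.

```sas
/* Sample values taken from the question in this thread */
data have;
    var = "Sasexampleee"; output;
    var = "Sasexampplee"; output;
    var = "Sasexammmppl"; output;
    var = "Sasexampl";    output;
run;

data want;
    set have;
    flag = 0;
    /* LENGTH ignores trailing blanks, so padding is never scanned */
    last_start = length(var) - 2;    /* last position where a run of three can begin */
    do i = 1 to last_start while (flag = 0);
        c = substr(var, i, 1);
        /* i + 2 never exceeds length(var), so SUBSTR stays in range */
        if c = substr(var, i+1, 1) and c = substr(var, i+2, 1) then flag = 1;
    end;
    drop i c last_start;
run;
```

Because the loop bound is derived from `length(var)`, values shorter than three characters simply skip the loop and keep `flag = 0`.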

Contributor
Posts: 24

## Re: How to identify repeating characters in the string.

Worked well. Thanks.

Posts: 1,147

## Re: How to identify repeating characters in the string.

You may also try:

data have;

char='a b b a c c c';

do i = 1 to 10;

new2=scan(char,i,' ');

output;

end;

run;

data want;

set have;

by notsorted new2;

retain count 0;

if first.new2 then count=1;

else count+1;

if last.new2 and count=3 and new2 ne '';

run;

Thanks,
Jag

Occasional Contributor
Posts: 15

## Re: How to identify repeating characters in the string.

THANK YOU all!!!  This works.

Appreciate all the help.

Posts: 3,167

## Re: How to identify repeating characters in the string.

Come on, there has to be a PRX solution for this:

data test;

infile cards truncover;

input str $100.;

flag=prxmatch('m/(\S)\1{2}/o',str)>0;

cards;

alkjahfkldjhklajdhfklj

akljsd******alkdfkj

;

Posts: 1,318

## Re: How to identify repeating characters in the string.

Here is an explanation of the regular expression:

(\S)\1{2}

Match the regular expression below and capture its match into backreference number 1 «(\S)»

Match a single character that is a “non-whitespace character” «\S»

Match the same text as most recently matched by capturing group number 1 «\1{2}»

Exactly 2 times «{2}»

The m before the starting delimiter (/) is Perl's optional match operator; with / as the delimiter it can be omitted, so it is not relevant here.

The o after the ending delimiter (/) is the compile-once option: it tells SAS the expression can be compiled once and then reused without recompilation throughout the execution of the data step.

See the documentation:

### Compiling a Perl Regular Expression

```
If perl-regular-expression is a constant or if it uses the /o option, the Perl regular expression is compiled only once. Successive calls to PRXPARSE will not cause a recompile, but will return the regular-expression-id for the regular expression that was already compiled. This behavior simplifies the code because you do not need to use an initialization block (IF _N_ =1) to initialize Perl regular expressions.

Note: If you have a Perl regular expression that is a constant, or if the regular expression uses the /o option, then calling PRXFREE to free the memory allocation results in the need to recompile the regular expression the next time that it is called by PRXPARSE.

The compile-once behavior occurs when you use PRXPARSE in a DATA step. For all other uses, the perl-regular-expression is recompiled for each call to PRXPARSE.
```
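To see the pattern explained above in action, here is a small sketch. The third data line and the `pos` variable are additions for illustration; the first two strings are the ones from the PRX post. The pattern is written as a constant without the `m` prefix or the `/o` suffix, on the assumption (based on the documentation quoted above) that a constant pattern in a DATA step is compiled only once anyway.

```sas
data test;
    infile datalines truncover;
    input str $char40.;
    /* PRXMATCH returns the 1-based position of the first match, or 0 if none */
    pos  = prxmatch('/(\S)\1{2}/', str);
    flag = (pos > 0);
    datalines;
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
aa   bb
;
run;

proc print data=test;
run;
```

Only the second string should be flagged, with `pos = 7` where the run of asterisks begins. The third string contains three consecutive blanks, but `\S` excludes whitespace, so it is not flagged. Swapping `\S` for `.` would also count runs of blanks, including the trailing padding of a fixed-length variable, so the value would need trimming first.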
Frequent Contributor
Posts: 115

## Re: How to identify repeating characters in the string.

Hi, a small request, and sorry it's off topic: I have noticed you using APPC functions extremely well compared with many other major contributors. Could you please point me to documentation that explains them in detail as well? I'd appreciate it so much. Thanks.

Posts: 1,318

## Re: How to identify repeating characters in the string.

naveen_srini,

Rather than taking this post off-topic, it would be more prudent to post a new question.  I will gladly answer you to the best that I can, as will others, I'm sure.

Posts: 5,543

## Re: How to identify repeating characters in the string.

Does anyone know if the "compile-once" behavior also occurs in SQL expressions?

Pg

Posts: 1,318

## Re: How to identify repeating characters in the string.

PG, it does have a similar behavior in SQL.

🔒 This topic is solved and locked.