SAS Procedures

How to identify repeating characters in the string.

pr1 · Posted 03-26-2015 10:30 AM

Hello, I have a 9-character data field and I need to identify whether any 3 consecutive characters in that string are the same. Any ideas on how to achieve this?

Thanks.

Pramodini

Jagadishkatam · Posted 03-26-2015 10:36 AM

Could you please give some example data?

Thanks,
Jag

ballardw · Posted 03-26-2015 10:42 AM

Does case matter? For example, does "aAa" count as the "same character"? Are any of the characters special characters such as punctuation: ()!@#$%^&*-_=+/><\|][{} ?

Do multiple spaces count?

Astounding · Posted 03-26-2015 10:46 AM

Assuming you want an exact match on any possible character, here is a way:

data want;
   set have;
   /* position 7 is the last place a run of 3 can start in a 9-character field */
   do i=1 to 7 until (flag=1);
      if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
   end;
run;

Good luck.
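
For anyone who wants to try the step above, here is a minimal sketch with made-up test data. The four sample values and the flag_nocase variable are assumptions added for illustration, not part of the original answer. It also shows one possible way to handle the case and blank questions raised earlier: upcase() for a case-insensitive comparison, and a check that skips runs of blanks.

```sas
/* Hypothetical sample data; HAVE and VAR follow the names used in the thread */
data have;
   length var $9;
   input var $char9.;
   datalines;
ABC111XYZ
aAaBCDEFG
ABCDEFGHI
AB   CDEF
;
run;

data want;
   set have;
   flag = 0;          /* exact match: case matters, blanks count as characters */
   flag_nocase = 0;   /* assumed variant: ignores case and runs of blanks */
   do i = 1 to 7;
      c1 = substr(var, i, 1);
      c2 = substr(var, i + 1, 1);
      c3 = substr(var, i + 2, 1);
      if c1 = c2 and c2 = c3 then flag = 1;
      if c1 ne ' ' and upcase(c1) = upcase(c2) and upcase(c2) = upcase(c3)
         then flag_nocase = 1;
   end;
   drop i c1 c2 c3;
run;
```

With these four rows, flag comes out as 1, 0, 0, 1 and flag_nocase as 1, 1, 0, 0: the "aAa" row only matches when case is ignored, and the row with three blanks only matches when blanks are allowed to count.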

BOBSAS · Posted 03-28-2015 09:45 PM

Hey. I was getting the correct result, but I'm getting a warning too. I modified the code a bit and tried again, but I can't figure it out.

Would it be possible for you to tell me why I am getting the warning below and how I can fix it?

Thanks.

data have;
   var="Sasexampleee"; output;
   var="Sasexampplee"; output;
   var="Sasexammmppl"; output;
   var="Sasexampl"; output;
run;

data want;
   set have;
   leng=length(var);
   do i=1 to leng;
      if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
   end;
run;

NOTE: Invalid second argument to function SUBSTR at line 27 column 52.

NOTE: Invalid second argument to function SUBSTR at line 27 column 30.

NOTE: Invalid second argument to function SUBSTR at line 27 column 52.

var=Sasexampplee leng=12 i=13 flag=. _ERROR_=1 _N_=2

NOTE: Invalid second argument to function SUBSTR at line 27 column 52.

var=Sasexamplee leng=11 i=12 flag=. _ERROR_=1 _N_=4

NOTE: There were 4 observations read from the data set WORK.HAVE.

PGStats · Posted 03-28-2015 10:02 PM

When i = leng in your loop, you look at character position i+2 which is beyond the length of the variable. You can simply stop the loop sooner:

data want;
   set have;
   /* last position where a run of 3 can start */
   leng=length(var) - 2;
   do i=1 to leng;
      if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
   end;
run;

PG
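
Putting the two loop ideas from this thread together (the shortened upper bound here and the UNTIL clause from the earlier answer), a combined version might look like the sketch below. The last_start variable and the explicit flag = 0 are my additions for illustration; the test data is the HAVE data set posted above.

```sas
data have;
   var="Sasexampleee"; output;
   var="Sasexampplee"; output;
   var="Sasexammmppl"; output;
   var="Sasexampl"; output;
run;

data want;
   set have;
   flag = 0;                       /* 0 rather than missing when no run is found */
   last_start = length(var) - 2;   /* last position where a run of 3 can start */
   /* TO is checked at the top, UNTIL at the bottom, so the loop stops at the first hit */
   do i = 1 to last_start until (flag = 1);
      if substr(var, i, 1) = substr(var, i + 1, 1)
         and substr(var, i + 1, 1) = substr(var, i + 2, 1) then flag = 1;
   end;
   drop i last_start;
run;
```

For values shorter than 3 characters, last_start is 0 or less, so the loop body never runs and flag stays 0. On the four rows above, flag should be 1, 0, 1, 0.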

BOBSAS · Posted 03-28-2015 10:10 PM

Worked well. Thanks.

Jagadishkatam · Posted 03-26-2015 10:50 AM

Also you may try,

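data have;
   char='a b b a c c c';
   /* one row per space-delimited token; extra iterations give blank tokens */
   do i = 1 to 10;
      new2=scan(char,i,' ');
      output;
   end;
run;

data want;
   set have;
   by notsorted new2;
   retain count 0;
   /* count the length of each run of identical tokens */
   if first.new2 then count=1;
   else count+1;
   /* keep the last row of any run that is exactly 3 long */
   if last.new2 and count=3 and new2 ne '';
run;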

Thanks,
Jag
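
The example above works on space-delimited tokens. To apply the same BY-group idea to the 9-character field from the original question, one option is to write one row per character with substr() first. This is only a sketch: the id variable, the sample value, and the chars data set name are hypothetical. Note that it tests count >= 3 so that runs longer than three are also kept, whereas count=3 keeps only runs of exactly three.

```sas
/* Hypothetical input: one 9-character value exploded into one row per character */
data chars;
   id = 1;                        /* assumed row identifier */
   var = 'abbaccc  ';
   do pos = 1 to length(var);     /* length() ignores trailing blanks */
      ch = substr(var, pos, 1);
      output;
   end;
run;

data want;
   set chars;
   by id ch notsorted;            /* a new group starts whenever the character changes */
   if first.ch then count = 1;
   else count + 1;                /* sum statement: count is retained automatically */
   if last.ch and count >= 3;     /* keep the last row of each run of 3 or more */
run;
```

For this sample value the result is a single row with ch='c', count=3 and pos=7.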

pr1 · Posted 03-26-2015 11:16 AM

THANK YOU all!!! This works.

Appreciate all the help.

Haikuo · Posted 03-26-2015 11:18 AM

Come on, there has to be a PRX solution for this:

data test;
   infile cards truncover;
   input str $100.;
   /* 1 if any non-whitespace character appears 3 times in a row, else 0 */
   flag=prxmatch('m/(\S)\1{2}/o',str)>0;
   cards;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;

FriedEgg · Posted 03-26-2015 08:04 PM

Here is an explanation of the regular expression:

(\S)\1{2}

Match the regular expression below and capture its match into backreference number 1 «(\S)»

Match a single character that is a “non-whitespace character” «\S»

Match the same text as most recently matched by capturing group number 1 «\1{2}»

Exactly 2 times «{2}»

The m before the starting delimiter (/) I would say is not relevant here.

The o after the ending delimiter (/) is an optimizer that tells SAS the expression can be held and reused without recompilation throughout the execution of the data step.

See the documentation:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295977.htm

Compiling a Perl Regular Expression

If perl-regular-expression is a constant or if it uses the /o option, the Perl regular expression is compiled only once. Successive calls to PRXPARSE will not cause a recompile, but will return the regular-expression-id for the regular expression that was already compiled. This behavior simplifies the code because you do not need to use an initialization block (IF _N_ =1) to initialize Perl regular expressions.
Note:   If you have a Perl regular expression that is a constant, or if the regular expression uses the /o option, then calling PRXFREE to free the memory allocation results in the need to recompile the regular expression the next time that it is called by PRXPARSE.
The compile-once behavior occurs when you use PRXPARSE in a DATA step. For all other uses, the perl-regular-expression is recompiled for each call to PRXPARSE.
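
For comparison, here is a sketch of the explicit initialization block (IF _N_ = 1) that the documentation excerpt above says you can skip when the pattern is a constant or uses /o. The rx, pos and repeated variable names are my own, and the PRXPOSN call to pull out the repeated character is an addition that was not in the original PRX answer.

```sas
data test;
   infile datalines truncover;
   input str $100.;
   retain rx;                               /* keep the compiled pattern id across rows */
   if _n_ = 1 then rx = prxparse('/(\S)\1{2}/');
   pos = prxmatch(rx, str);                 /* start of the first run, 0 if none */
   flag = (pos > 0);
   length repeated $1;
   if flag then repeated = prxposn(rx, 1, str);   /* capture group 1: the repeated character */
   drop rx;
   datalines;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;
run;
```

On the three sample rows, pos should be 8, 0 and 7, with repeated set to '8' on the first row and '*' on the third.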

naveen_srini · Posted 03-27-2015 12:45 PM

Hi, a small request, and sorry that it's off topic: I have noticed you using APPC functions extremely well, as opposed to many other major contributors. Can you please provide me a link to documentation that explains those in detail too? I'd appreciate it so much. Thanks.

FriedEgg · Posted 03-27-2015 03:53 PM

naveen_srini,

Rather than taking this post off-topic, it would be more prudent to post a new question. I will gladly answer you to the best that I can, as will others, I'm sure.

PGStats · Posted 03-27-2015 01:39 PM

Does anyone know if the "compile-once" behavior also occurs in SQL expressions?

PG

FriedEgg · Posted 03-27-2015 03:51 PM

PG, it does have a similar behavior in SQL
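
For reference, a minimal sketch of the same pattern used inside PROC SQL. The flagged table name is hypothetical, and the test data is the sample from the PRX post above. This only shows that PRXMATCH can be called with a constant pattern in an SQL expression; it does not by itself demonstrate how often the pattern is compiled.

```sas
data test;
   infile datalines truncover;
   input str $100.;
   datalines;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;
run;

proc sql;
   /* flag is 1 when any non-whitespace character repeats 3 times in a row */
   create table flagged as
   select str,
          (prxmatch('/(\S)\1{2}/', str) > 0) as flag
   from test;
quit;
```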
