Hello, I have a 9-character data field and I need to identify whether any 3 consecutive characters in that string are the same. Any ideas on how to achieve this?
Thanks.
Pramodini
Assuming you want an exact match on any possible character, here is a way:
data want;
set have;
do i=1 to 7 until (flag=1);
if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
end;
run;
Good luck.
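For anyone who wants to try it, here is a minimal sketch of the same idea with sample data. The data set name have, the variable name var, the helper variable c, and the sample values are assumptions for illustration only. The comparison is written out with AND, and the character must be non-blank so that trailing spaces in a short value do not count as a match. Drop that condition if runs of spaces should count.

```sas
/* Hypothetical sample data: one 9-character field per row */
data have;
   length var $9;
   input var $9.;
   datalines;
abc111xyz
abcdefghi
aab
;
run;

data want;
   set have;
   length c $1;
   flag = 0;
   /* Stop two positions before the last non-blank character */
   do i = 1 to length(var) - 2 until (flag = 1);
      c = substr(var, i, 1);
      /* Require a non-blank character repeated in the next two positions */
      if c ne ' ' and c = substr(var, i+1, 1) and c = substr(var, i+2, 1) then flag = 1;
   end;
   drop i c;
run;
```

With these sample rows, flag should be 1 for the first value and 0 for the other two.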
Could you please provide some example data?
Does case matter? For example, does "aAa" count as the "same character"? Can the field contain special characters such as punctuation, e.g. ()!@#$%^&*-_=+/><\|][{} ?
Do multiple spaces count?
Hey. I was getting the correct result, but I'm getting a warning too. I modified the code a bit but can't figure it out.
Would it be possible for you to tell me why I am getting the warning below and how I can fix it?
Thanks.
data have;
var="Sasexampleee";output;
var="Sasexampplee";output;
var="Sasexammmppl";output;
var="Sasexampl";output;
run;
data want;
set have;
leng=length(var);
do i=1 to leng;
if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
end;
run;
NOTE: Invalid second argument to function SUBSTR at line 27 column 52.
NOTE: Invalid second argument to function SUBSTR at line 27 column 30.
NOTE: Invalid second argument to function SUBSTR at line 27 column 52.
var=Sasexampplee leng=12 i=13 flag=. _ERROR_=1 _N_=2
NOTE: Invalid second argument to function SUBSTR at line 27 column 52.
var=Sasexamplee leng=11 i=12 flag=. _ERROR_=1 _N_=4
NOTE: There were 4 observations read from the data set WORK.HAVE.
When i reaches leng - 1 and leng in your loop, you look at character positions i+1 and i+2, which are beyond the length of the variable. That is what the SUBSTR notes are reporting. You can simply stop the loop sooner:
data want;
set have;
leng=length(var) - 2;
do i=1 to leng;
if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
end;
run;
PG
Worked well. Thanks.
Also you may try,
data have;
char='a b b a c c c';
do i = 1 to 10;
new2=scan(char,i,' ');
output;
end;
run;
data want;
set have;
by notsorted new2;
retain count 0;
if first.new2 then count=1;
else count+1;
/* keep runs of 3 or more, so that longer runs are flagged as well */
if last.new2 and count>=3 and new2 ne '';
run;
Thanks,
Jag
THANK YOU all!!! This works.
Appreciate all the help.
Come on, there has to be a PRX solution for this:
data test;
infile cards truncover;
input str $ 100.;
flag=prxmatch('m/(\S)\1{2}/o',str)>0;
cards;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;
Here is an explanation of the regular expression:
(\S)\1{2}
Match the regular expression below and capture its match into backreference number 1 «(\S)»
Match a single character that is a “non-whitespace character” «\S»
Match the same text as most recently matched by capturing group number 1 «\1{2}»
Exactly 2 times «{2}»
The m before the starting delimiter (/) is optional when the delimiter is a slash, so it is not relevant here.
The o after the ending delimiter (/) is the compile-once option: it tells SAS the expression can be compiled once and reused without recompilation throughout the execution of the data step.
See the documentation:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295977.htm
If perl-regular-expression is a constant or if it uses the /o option, the Perl regular expression is compiled only once. Successive calls to PRXPARSE will not cause a recompile, but will return the regular-expression-id for the regular expression that was already compiled. This behavior simplifies the code because you do not need to use an initialization block (IF _N_ =1) to initialize Perl regular expressions.
Note: If you have a Perl regular expression that is a constant, or if the regular expression uses the /o option, then calling PRXFREE to free the memory allocation results in the need to recompile the regular expression the next time that it is called by PRXPARSE.
The compile-once behavior occurs when you use PRXPARSE in a DATA step. For all other uses, the perl-regular-expression is recompiled for each call to PRXPARSE.
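For comparison, here is a sketch of the explicit initialization block that the documentation excerpt above says you can avoid. It compiles the pattern with PRXPARSE on the first iteration only and retains the id. The variable name re is an assumption, and the sample data is the same as in the earlier post.

```sas
data test;
   infile cards truncover;
   input str $100.;
   /* Compile the pattern once and keep the id across observations */
   retain re;
   if _n_ = 1 then re = prxparse('/(\S)\1{2}/');
   /* 1 when a non-whitespace character appears 3 times in a row, else 0 */
   flag = prxmatch(re, str) > 0;
   drop re;
cards;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;
run;
```

This should flag the first and third rows and not the second, the same result as the one-line PRXMATCH call with a constant pattern.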
naveen_srini,
Rather than taking this post off-topic, it would be more prudent to post a new question. I will gladly answer you to the best that I can, as will others, I'm sure.
Does anyone know if the "compile-once" behavior also occurs in SQL expressions?
Pg
PG, it does have a similar behavior in SQL
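For reference, here is a minimal sketch of the same test in PROC SQL, assuming the data set test from the earlier post already exists. It only shows that PRXMATCH can be called with a constant pattern in an SQL expression. It does not by itself demonstrate how often the pattern is compiled.

```sas
proc sql;
   /* flag is 1 when any non-whitespace character occurs 3 times in a row */
   select str,
          prxmatch('/(\S)\1{2}/', str) > 0 as flag
   from test;
quit;
```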