Hello, I have a 9-character data field and I need to identify whether any 3 consecutive characters in that string are the same. Any ideas on how to achieve this?
Thanks.
Pramodini
Accepted Solutions
Assuming you want an exact match on any possible character, here is a way:
data want;
set have;
do i=1 to 7 until (flag=1);
if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
end;
run;
Good luck.
Could you please give some example data?
Jag
Does case matter? For example does "aAa" count as the "same character"? Are any of the characters special characters such as punctuation, ()!@#$%^&*-_=+/><\|][{}
Do multiple spaces count?
Hey. I was getting the correct result, but I'm getting a warning too. I modified the code a bit and tried again, but I can't figure it out.
Could you tell me why I am getting the messages below and how I can fix them?
Thanks.
data have;
var="Sasexampleee";output;
var="Sasexampplee";output;
var="Sasexammmppl";output;
var="Sasexampl";output;
run;
data want;
set have;
leng=length(var);
do i=1 to leng;
if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
end;
run;
NOTE: Invalid second argument to function SUBSTR at line 27 column 52.
NOTE: Invalid second argument to function SUBSTR at line 27 column 30.
NOTE: Invalid second argument to function SUBSTR at line 27 column 52.
var=Sasexampplee leng=12 i=13 flag=. _ERROR_=1 _N_=2
NOTE: Invalid second argument to function SUBSTR at line 27 column 52.
var=Sasexamplee leng=11 i=12 flag=. _ERROR_=1 _N_=4
NOTE: There were 4 observations read from the data set WORK.HAVE.
When i = leng in your loop, you look at character position i+2 which is beyond the length of the variable. You can simply stop the loop sooner:
data want;
set have;
leng=length(var) - 2;
do i=1 to leng;
if substr(var, i, 1) = substr(var, i+1, 1) = substr(var, i+2, 1) then flag=1;
end;
run;
PG
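As a hedged sketch, the two loops in this thread can be combined: stop at length(var) - 2 so that i+2 never points past the last non-blank character, and keep the UNTIL clause so the loop exits on the first hit. The input step, the explicit flag = 0 initialisation and the DROP statement are assumptions added for illustration; the comparison is spelled out with an explicit AND for readability.

```sas
/* Sample data taken from the question above */
data have;
  length var $12;
  input var $;
  datalines;
Sasexampleee
Sasexampplee
Sasexammmppl
Sasexampl
;
run;

data want;
  set have;
  flag = 0;  /* assumption: report 0 instead of missing when no run is found */
  /* length() ignores trailing blanks, so the last valid start is length - 2 */
  do i = 1 to length(var) - 2 until (flag = 1);
    if substr(var, i, 1) = substr(var, i+1, 1)
       and substr(var, i+1, 1) = substr(var, i+2, 1) then flag = 1;
  end;
  drop i;
run;
```

With these four values, flag comes out as 1, 0, 1, 0. For a value shorter than three characters the upper bound is below 1, so the loop body should not run and flag stays 0.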
Worked well. Thanks.
Also you may try,
data have;
char='a b b a c c c';
do i = 1 to 10;
new2=scan(char,i,' ');
output;
end;
run;
data want;
set have;
by notsorted new2;
retain count 0;
if first.new2 then count=1;
else count+1;
if last.new2 and count>=3 and new2 ne ''; /* >=3 so that runs longer than three also qualify */
run;
Thanks,
Jag
THANK YOU all!!! This works.
Appreciate all the help.
Come on, there has to be a PRX solution for this:
data test;
infile cards truncover;
input str $ 100.;
flag=prxmatch('m/(\S)\1{2}/o',str)>0;
cards;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;
Here is an explanation of the regular expression:
(\S)\1{2}
Match the regular expression below and capture its match into backreference number 1 «(\S)»
Match a single character that is a “non-whitespace character” «\S»
Match the same text as most recently matched by capturing group number 1 «\1{2}»
Exactly 2 times «{2}»
The m before the starting delimiter (/) is the Perl match operator. It is optional when the delimiter is a slash, so it makes no difference here.
The o after the ending delimiter (/) is the compile-once option: it tells SAS to compile the expression once and reuse it, rather than recompiling it throughout the execution of the DATA step.
See the documentation:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295977.htm
Compiling a Perl Regular Expression
If perl-regular-expression is a constant or if it uses the /o option, the Perl regular expression is compiled only once. Successive calls to PRXPARSE will not cause a recompile, but will return the regular-expression-id for the regular expression that was already compiled. This behavior simplifies the code because you do not need to use an initialization block (IF _N_ =1) to initialize Perl regular expressions.
Note: If you have a Perl regular expression that is a constant, or if the regular expression uses the /o option, then calling PRXFREE to free the memory allocation results in the need to recompile the regular expression the next time that it is called by PRXPARSE.
The compile-once behavior occurs when you use PRXPARSE in a DATA step. For all other uses, the perl-regular-expression is recompiled for each call to PRXPARSE.
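The compile-once behavior quoted above can also be made explicit. The following is a minimal sketch, not the poster's code: it compiles the same pattern with PRXPARSE on the first observation, retains the returned id, and passes that id to PRXMATCH. The variable name rx is an assumption chosen for illustration.

```sas
data test;
  infile datalines truncover;
  input str $100.;
  /* Compile the pattern once, on the first observation, and keep the id */
  retain rx;
  if _n_ = 1 then rx = prxparse('/(\S)\1{2}/');
  /* prxmatch returns the start position of the match, or 0 if there is none */
  flag = prxmatch(rx, str) > 0;
  drop rx;
  datalines;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
akljsd******alkdfkj
;
run;
```

For these three rows flag is 1, 0, 1. Because \S excludes whitespace, the trailing blanks that pad str, and any run of embedded spaces, are never counted as a repeated character, unlike the SUBSTR loop, which compares blanks like any other character.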
naveen_srini,
Rather than taking this post off-topic, it would be better to post a new question. I will gladly answer as best I can, as will others, I'm sure.
Does anyone know if the "compile-once" behavior also occurs in SQL expressions?
Pg
PG, it does have similar behavior in SQL.
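For reference, here is a small sketch of what calling PRXMATCH from PROC SQL might look like. It only shows the shape of the call and does not demonstrate whether or when the expression is recompiled. The table name test and the column alias flag are assumptions carried over from the DATA step example earlier in the thread.

```sas
data test;
  infile datalines truncover;
  input str $100.;
  datalines;
adlsfkj888adklfj
alkjahfkldjhklajdhfklj
;
run;

proc sql;
  /* The comparison yields 1 when a match is found and 0 otherwise */
  select str,
         (prxmatch('/(\S)\1{2}/', str) > 0) as flag
  from test;
quit;
```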