Hi,
I want to find the position of each letter "a" in a string.
Is there a simple way to do this, something like find(string,'a',2)?
Thank you,
HHCFX
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;set have;
first_a=find(string,'a');
second_a=find(string,'a',2);
run;
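One thing to note about the step above: a numeric third argument to FIND is a start position, not an occurrence count, so find(string,'a',2) begins searching at column 2 and still returns 4 for this string. A minimal sketch of getting the second occurrence by restarting just past the first one (the guard for "no first match" is an addition of mine, not part of the original post):

```sas
data want;
  set have;
  first_a = find(string, 'a');
  /* restart one column past the first hit; leave second_a missing if there was no first hit */
  if first_a > 0 then second_a = find(string, 'a', first_a + 1);
run;
```

For the sample string this gives first_a=4 and second_a=10. It only covers two occurrences; a loop is needed for the general case.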
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
set have;
do i=1 to lengthn(string);
if char(string,i)='a' then do;position=i;output;end;
end;
run;
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
set have;
a=0;
do pos=findc(string,'a') by 0 while (pos);
a+1;
output;
pos= findc(string,'a',pos+1) ;
end;
run;
novinosrin provided the idea, and I have built further on it.
hhchenfx is looking for a SAS function that returns the position and order of occurrence of 'a' in a string. I could not find a ready-made SAS function for this. However, we can build one with the SAS function compiler (PROC FCMP). Here is such a function:
proc fcmp outlib = work.cmput.str;
   function findpatnext(str $, pat $);
      file log;
      static lastpos 1;
      pos = findc(str, pat, 'it', lastpos); /* edit: added semicolon */
      lastpos = pos + 1;
      return(pos);
   endsub;
quit;
In a Data Step we can call the function as:
options cmplib = work.cmput;
data want;
   length str $26;
   str = '098ab-c1za3-8a1-8a-1a01212';
   loc = 1;
   do order = 1 by 1 while(loc);
      loc = findpatnext(str, 'a');
      if loc then output;
   end;
run;
The output Data Set, WANT:
Obs | str | loc | order |
---|---|---|---|
1 | 098ab-c1za3-8a1-8a-1a01212 | 4 | 1 |
2 | 098ab-c1za3-8a1-8a-1a01212 | 10 | 2 |
3 | 098ab-c1za3-8a1-8a-1a01212 | 14 | 3 |
4 | 098ab-c1za3-8a1-8a-1a01212 | 18 | 4 |
5 | 098ab-c1za3-8a1-8a-1a01212 | 21 | 5 |
We repeat the Data Step using NAME in SASHELP.CLASS as:
data want;
   set sashelp.class(keep = name);
   loc = 1;
   do order = 1 by 1 while(loc);
      loc = findpatnext(name, 'a');
      if loc then output;
   end;
run;
The output Data Set is:
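Obs | Name | loc | order |
---|---|---|---|
1 | Alfred | 1 | 1 |
2 | Alice | 1 | 1 |
3 | Barbara | 2 | 1 |
4 | Barbara | 5 | 2 |
5 | Barbara | 7 | 3 |
6 | Carol | 2 | 1 |
7 | James | 2 | 1 |
8 | Jane | 2 | 1 |
9 | Janet | 2 | 1 |
10 | Mary | 2 | 1 |
11 | Ronald | 4 | 1 |
12 | Thomas | 5 | 1 |
13 | William | 6 | 1 |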
Edited:
If only the n-th occurrence of 'a' is needed, the do-loop in the Data Step can be modified. For example, to get the 3rd occurrence, the following Data Step can be used:
data want;
   length str $26;
   str = '098ab-c1za3-8a1-8a-1a01212';
   loc = 1;
   do order = 1 by 1 while(loc);
      loc = findpatnext(str, 'a');
      if order = 3 then do;
         output;
         leave;
      end;
   end;
run;
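As far as I can tell, the static lastpos in findpatnext is only reset when a search comes back empty, so leaving the loop early could carry a stale start position into the next observation. A hedged sketch of the same n-th occurrence lookup without the FCMP function, calling FINDC directly with an explicit start position (the variables n and i are names I made up for illustration):

```sas
data want(drop = i);
   length str $26;
   str = '098ab-c1za3-8a1-8a-1a01212';
   n = 3;                                    /* which occurrence to report (illustrative) */
   loc = 0;
   do i = 1 to n;
      loc = findc(str, 'a', 'it', loc + 1);  /* search starts just past the previous hit */
      if loc = 0 then leave;                 /* fewer than n occurrences in the string */
   end;
   if loc then output;
run;
```

For the sample string this writes one row with loc=14, matching the third row of the table above. Because the start position lives in the Data Step variable loc, nothing is remembered between observations.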
A classic use of the CALL PRXNEXT routine:
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want(keep=string pos);
set have;
RegExID = prxparse('/a/');
start=1;
call prxnext(RegExID, start, length(string), string, pos, length);
do while (pos > 0);
output;
call prxnext(RegExID, start, length(string), string, pos, length);
end;
run;
Thanks a lot.
Very easy to follow code.
HHC