Hi,
I want to find the position of each letter "a" in a string.
Is there a simple way, something like find(string,'a',2)?
Thank you,
HHCFX
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;set have;
first_a=find(string,'a');
second_a=find(string,'a',2);
run;
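A note on the step above: in FIND, a numeric third argument is a start position, not an occurrence number. So find(string,'a',2) begins the search at column 2, and for this string it returns the same position as first_a. One way to reach the second occurrence is to restart the search just past the first hit. The following is only a minimal sketch of that idea, reusing the same sample data and variable names:

```sas
data have;
  input string :$50.;
  datalines;
098ab-c1za3-8a1-8a-1a01212
;
run;

data want;
  set have;
  /* position of the first 'a' (0 if there is none) */
  first_a = find(string, 'a');
  /* restart the search one column after the first hit;
     only attempted when a first hit exists */
  if first_a > 0 then second_a = find(string, 'a', first_a + 1);
  else second_a = 0;
run;
```

For the sample string this should give first_a = 4 and second_a = 10. Chaining calls like this gets awkward beyond two or three occurrences, which is where a loop becomes the better fit.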
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
set have;
do i=1 to lengthn(string);
if char(string,i)='a' then do;position=i;output;end;
end;
run;
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
set have;
a=0;
do pos=findc(string,'a') by 0 while (pos);
a+1;
output;
pos= findc(string,'a',pos+1) ;
end;
run;
novinosrin provided the idea, and I have built on it.
hhchenfx is looking for a SAS function that returns the position and order of each occurrence of 'a' in a string. I could not find a ready-made SAS function for this. However, we can build one with the SAS Function Compiler (PROC FCMP). Here is such a function:
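proc fcmp outlib = work.cmput.str;
   function findpatnext(str $, pat $);
      file log;
      /* lastpos keeps its value between calls */
      static lastpos 1;
      /* 'i' ignores case, 't' trims trailing blanks */
      pos = findc(str, pat, 'it', lastpos);
      /* next call resumes after this hit; resets to 1 when pos = 0 */
      lastpos = pos + 1;
      return(pos);
   endsub;
quit;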
In a Data Step we can call the function as:
options cmplib = work.cmput;
data want;
   length str $26;
   str = '098ab-c1za3-8a1-8a-1a01212';
   loc = 1;
   do order = 1 by 1 while(loc);
      loc = findpatnext(str, 'a');
      if loc then output;
   end;
run;
The output Data Set, WANT:
Obs | str | loc | order |
---|---|---|---|
1 | 098ab-c1za3-8a1-8a-1a01212 | 4 | 1 |
2 | 098ab-c1za3-8a1-8a-1a01212 | 10 | 2 |
3 | 098ab-c1za3-8a1-8a-1a01212 | 14 | 3 |
4 | 098ab-c1za3-8a1-8a-1a01212 | 18 | 4 |
5 | 098ab-c1za3-8a1-8a-1a01212 | 21 | 5 |
We repeat the Data Step using NAME in SASHELP.CLASS as:
data want;
   set sashelp.class(keep = name);
   loc = 1;
   do order = 1 by 1 while(loc);
      loc = findpatnext(name, 'a');
      if loc then output;
   end;
run;
The output Data Set is:
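Obs | Name | loc | order |
---|---|---|---|
1 | Alfred | 1 | 1 |
2 | Alice | 1 | 1 |
3 | Barbara | 2 | 1 |
4 | Barbara | 5 | 2 |
5 | Barbara | 7 | 3 |
6 | Carol | 2 | 1 |
7 | James | 2 | 1 |
8 | Jane | 2 | 1 |
9 | Janet | 2 | 1 |
10 | Mary | 2 | 1 |
11 | Ronald | 4 | 1 |
12 | Thomas | 5 | 1 |
13 | William | 6 | 1 |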
Edited:
If only the n-th occurrence of 'a' is needed, the do-loop in the Data Step can be modified. For example, to get the 3rd occurrence, the following Data Step can be used:
data want;
   length str $26;
   str = '098ab-c1za3-8a1-8a-1a01212';
   loc = 1;
   do order = 1 by 1 while(loc);
      loc = findpatnext(str, 'a');
      if order = 3 then do;
         output;
         leave;
      end;
   end;
run;
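For comparison, here is a minimal sketch of getting the n-th occurrence without a user-defined function, by calling FINDC n times and restarting each search just past the previous hit. The variable n and the loop counter k are illustrative names, not part of the original code. It also sidesteps one thing to watch for in the step above: findpatnext keeps its search position in a static variable, so leaving the loop early does not reset that position before the next call. Unlike findpatnext, this sketch does not use the 'i' modifier, so it is case-sensitive.

```sas
data want;
  length str $26;
  str = '098ab-c1za3-8a1-8a-1a01212';
  n = 3;                      /* which occurrence to look for (assumed) */
  loc = 0;
  do k = 1 to n;
    /* search from the column after the previous hit */
    loc = findc(str, 'a', loc + 1);
    if loc = 0 then leave;    /* fewer than n occurrences */
  end;
  drop k;
run;
```

For the sample string this should leave loc = 14, matching the third row of the earlier output. If the string has fewer than n occurrences, loc ends up as 0.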
A classic use of the CALL PRXNEXT routine:
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want(keep=string pos);
set have;
RegExID = prxparse('/a/');
start=1;
call prxnext(RegExID, start, length(string), string, pos, length);
do while (pos > 0);
output;
call prxnext(RegExID, start, length(string), string, pos, length);
end;
run;
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
set have;
do i=1 to lengthn(string);
if char(string,i)='a' then do;position=i;output;end;
end;
run;
Thanks a lot.
Very easy to follow code.
HHC