Hi,
I want to find the position of each letter "a" in a string.
Is there a simple way, something like find(string,'a',2)?
Thank you,
HHCFX
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
    set have;
    first_a=find(string,'a');      /* position of the first 'a' */
    second_a=find(string,'a',2);   /* wished-for syntax; FIND reads the 2 as a start position */
run;
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want;
    set have;
    /* check every character and write one row per 'a' found */
    do i=1 to lengthn(string);
        if char(string,i)='a' then do;
            position=i;
            output;
        end;
    end;
run;
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
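data want;
    set have;
    a=0;                                   /* occurrence counter */
    /* BY 0 leaves pos alone; the loop ends when FINDC returns 0 */
    do pos=findc(string,'a') by 0 while (pos);
        a+1;
        output;
        pos=findc(string,'a',pos+1);       /* search again, starting after the last hit */
    end;
run;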
novinosrin has provided the idea, and I am building further on it.
hhchenfx is looking for a SAS function that returns the position and the order of occurrence of 'a' in a string. I could not find a ready-made SAS function for this. However, we can build one with the SAS Function Compiler (PROC FCMP). Here is such a function:
proc fcmp outlib = work.cmput.str;
    function findpatnext(str $, pat $);
        file log;
        static lastpos 1;                       /* keeps its value between calls */
        pos = findc(str, pat, 'it', lastpos);   /* 'i' ignores case, 't' trims trailing blanks */
        lastpos = pos + 1;                      /* goes back to 1 once pos is 0 */
        return(pos);
    endsub;
quit;
In a Data Step we can call the function as:
options cmplib = work.cmput;
data want;
    length str $26;
    str = '098ab-c1za3-8a1-8a-1a01212';
    loc = 1;
    /* call the function until it returns 0 (no further match) */
    do order = 1 by 1 while(loc);
        loc = findpatnext(str, 'a');
        if loc then output;
    end;
run;
The output Data Set, WANT:
| Obs | str | loc | order |
|---|---|---|---|
| 1 | 098ab-c1za3-8a1-8a-1a01212 | 4 | 1 |
| 2 | 098ab-c1za3-8a1-8a-1a01212 | 10 | 2 |
| 3 | 098ab-c1za3-8a1-8a-1a01212 | 14 | 3 |
| 4 | 098ab-c1za3-8a1-8a-1a01212 | 18 | 4 |
| 5 | 098ab-c1za3-8a1-8a-1a01212 | 21 | 5 |
We repeat the Data Step using NAME in SASHELP.CLASS as:
data want;
    set sashelp.class(keep = name);
    loc = 1;
    do order = 1 by 1 while(loc);
        loc = findpatnext(name, 'a');
        if loc then output;
    end;
run;
The output Data Set is:
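| Obs | Name | loc | order |
|---|---|---|---|
| 1 | Alfred | 1 | 1 |
| 2 | Alice | 1 | 1 |
| 3 | Barbara | 2 | 1 |
| 4 | Barbara | 5 | 2 |
| 5 | Barbara | 7 | 3 |
| 6 | Carol | 2 | 1 |
| 7 | James | 2 | 1 |
| 8 | Jane | 2 | 1 |
| 9 | Janet | 2 | 1 |
| 10 | Mary | 2 | 1 |
| 11 | Ronald | 4 | 1 |
| 12 | Thomas | 5 | 1 |
| 13 | William | 6 | 1 |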
Edited:
If only the n-th occurrence of 'a' is needed, the do-loop in the Data Step can be modified. For example, to get the third occurrence, the following Data Step can be used:
data want;
    length str $26;
    str = '098ab-c1za3-8a1-8a-1a01212';
    loc = 1;
    do order = 1 by 1 while(loc);
        loc = findpatnext(str, 'a');
        /* stop as soon as the third occurrence has been reached */
        if order = 3 then do;
            output;
            leave;
        end;
    end;
run;
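For comparison, here is a minimal sketch of the same n-th occurrence lookup without FCMP. It relies on the fact that the numeric third argument of FIND is a start position, so each call resumes just after the previous hit. The variable n is a name I made up for illustration. Note that FIND is case-sensitive unless the 'i' modifier is passed, whereas findpatnext above ignores case.

```sas
data want;
    length str $26;
    str = '098ab-c1za3-8a1-8a-1a01212';
    n = 3;                               /* hypothetical: which occurrence to report */
    loc = 0;
    do order = 1 to n;
        loc = find(str, 'a', loc + 1);   /* third argument is a start position */
        if loc = 0 then leave;           /* fewer than n occurrences in str */
    end;
    /* loc now holds the position of the n-th 'a', or 0 if there is none */
    drop order;
run;
```

For this string, loc should come out as 14, which agrees with the third row of the output table shown earlier.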
A classic use of the Call PRXNEXT Routine
data have;
input string :$50.;
datalines;
098ab-c1za3-8a1-8a-1a01212
;run;
data want(keep=string pos);
    set have;
    RegExID = prxparse('/a/');
    start=1;
    /* each call sets pos to the next match and moves start past it */
    call prxnext(RegExID, start, length(string), string, pos, length);
    do while (pos > 0);
        output;
        call prxnext(RegExID, start, length(string), string, pos, length);
    end;
run;
Thanks a lot.
Very easy to follow code.
HHC