Hi Everyone,
I would like to find the starting position of each occurrence of a word in my text variable.
In the example below, the first myword starts at position 3 and the second myword starts at position 10.
Can you please help me with that?
Thank you.
HHCFX
<"myword myword: []/'\":+!@#$%& -0*&^% myword><W15ySpnsrCW1sA5ZZ0</W1urvA5ySpnsrCW1sA5ZZD><W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><W1:MA5oZozA5m></W1:MA5ozW1DW1ozW1Mrup><W1:MA5ozW1DW1ozW1Mrup MA5ozW1DW1ozW1MrupNW1mA5="A5CW1SA5_CS_MW1"><W1W1ozW1ZZ A5lA5mA5nozNW1mA5="WRKLW1D_OZYPA5_CDA5">myword</W1W1ozW1ZZoA5m><W1W1ozW1ZZozA5m A5lA5mA5nozNW1mA5="WRKLW1D_P_OZYPA5_CDA5"></1W1ozW1ZZozA5m></W1:MA5ozW1DW1ozW1Mrup></W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><WMW1zzlzznmW1ddrAW1ddr1>20OZ 11W1</W1:W1ddr1><W1:W1ddr3>NW1RRW1MW1NSA5OZOZ, RZZ -</W1:W1ddr3>myword
data hh;
infile datalines dlm="|";
length var $ 30000;
input var $;
datalines;
<"myword myword: []/'\":+!@#$%& -0*&^% myword><W15ySpnsrCW1sA5ZZ0</W1urvA5ySpnsrCW1sA5ZZD><W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><W1:MA5oZozA5m></W1:MA5ozW1DW1ozW1Mrup><W1:MA5ozW1DW1ozW1Mrup MA5ozW1DW1ozW1MrupNW1mA5="A5CW1SA5_CS_MW1"><W1W1ozW1ZZ A5lA5mA5nozNW1mA5="WRKLW1D_OZYPA5_CDA5">myword</W1W1ozW1ZZoA5m><W1W1ozW1ZZozA5m A5lA5mA5nozNW1mA5="WRKLW1D_P_OZYPA5_CDA5"></1W1ozW1ZZozA5m></W1:MA5ozW1DW1ozW1Mrup></W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><WMW1zzlzznmW1ddrAW1ddr1>20OZ 11W1</W1:W1ddr1><W1:W1ddr3>NW1RRW1MW1NSA5OZOZ, RZZ -</W1:W1ddr3>myword
;
run;
data want;
set hh;
i = 1;
p = find(var,'myword',i,'I');
do while(p gt 0);
output;
i = p+1;
p = find(var,'myword',i,'I');
end;
run;
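As a quick sanity check of the FIND loop above, here is a minimal sketch on a shortened version of the string. The variable names (txt, word, start) are made up for illustration and are not part of the original post. It resumes the search just past each match instead of one character later, on the assumption that overlapping matches are not wanted; for 'myword' both approaches give the same positions.

```sas
data _null_;
   length txt $ 40 word $ 6;
   /* shortened test string; the word starts at positions 3, 10 and 18 */
   txt  = '<"myword myword: myword>';
   word = 'myword';
   start = 1;
   /* FIND(string, substring, modifiers, startpos): 'i' ignores case */
   p = find(txt, word, 'i', start);
   do while (p > 0);
      put p=;
      /* resume the search just past the match that was found */
      start = p + length(word);
      p = find(txt, word, 'i', start);
   end;
run;
```

The log should show p=3, p=10 and p=18; the first two match the positions given in the question.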
Here is one way using a regular expression:
data hh;
infile datalines4 dlm="|";
length var $ 30000;
input var $;
datalines4;
<"myword myword: []/'\":+!@#$%& -0*&^% myword><W15ySpnsrCW1sA5ZZ0</W1urvA5ySpnsrCW1sA5ZZD><W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><W1:MA5oZozA5m></W1:MA5ozW1DW1ozW1Mrup><W1:MA5ozW1DW1ozW1Mrup MA5ozW1DW1ozW1MrupNW1mA5="A5CW1SA5_CS_MW1"><W1W1ozW1ZZ A5lA5mA5nozNW1mA5="WRKLW1D_OZYPA5_CDA5">myword</W1W1ozW1ZZoA5m><W1W1ozW1ZZozA5m A5lA5mA5nozNW1mA5="WRKLW1D_P_OZYPA5_CDA5"></1W1ozW1ZZozA5m></W1:MA5ozW1DW1ozW1Mrup></W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><WMW1zzlzznmW1ddrAW1ddr1>20OZ 11W1</W1:W1ddr1><W1:W1ddr3>NW1RRW1MW1NSA5OZOZ, RZZ -</W1:W1ddr3>myword
;;;;
run;
data want;
set hh;
length positions $200;
prxData=prxParse('/myword/i');
start=1;
call missing(positions);
do _N_=1 to 30000;
call prxNext(prxData,start,-1,var,pos,len);
if len=0 then leave;
else positions=catx(',',positions,pos);
end;
drop start pos len;
run;
Art, CEO, AnalystFinder.com
Thank you for your solution.
I tried the code below, and it kind of works.
There are two problems:
- it misses the last "myword"
- I tried setting i=position so the loop would jump ahead, but that doesn't work.
Can anyone fix it for me?
Thanks a lot.
HHCFX
data hh;
infile datalines4 dlm="|";
length var $ 30000;
input var $;
datalines4;
<"myword myword: []/'\":+!@#$%& -0*&^% myword><W15ySpnsrCW1sA5ZZ0</W1urvA5ySpnsrCW1sA5ZZD><W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><W1:MA5oZozA5m></W1:MA5ozW1DW1ozW1Mrup><W1:MA5ozW1DW1ozW1Mrup MA5ozW1DW1ozW1MrupNW1mA5="A5CW1SA5_CS_MW1"><W1W1ozW1ZZ A5lA5mA5nozNW1mA5="WRKLW1D_OZYPA5_CDA5">myword</W1W1ozW1ZZoA5m><W1W1ozW1ZZozA5m A5lA5mA5nozNW1mA5="WRKLW1D_P_OZYPA5_CDA5"></1W1ozW1ZZozA5m></W1:MA5ozW1DW1ozW1Mrup></W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><WMW1zzlzznmW1ddrAW1ddr1>20OZ 11W1</W1:W1ddr1><W1:W1ddr3>NW1RRW1MW1NSA5OZOZ, RZZ -</W1:W1ddr3>myword
;;;;
run;
data want;
set hh;
keep position;
do i=1 to 200;
if findc(var,'myword')>0 then do;
position=find(var,'myword',i);
output;
end;
end;
run;
PROC SORT nodupkey data= want out=want2;
by position;
run;
You're not searching the entire string! Try:
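data want;
set hh;
keep position;
/* scan from every possible start position, not just the first 200 */
do i=1 to length(var);
/* note: FINDC looks for any single character of 'myword', not the whole word */
if findc(var,'myword')>0 then do;
/* FIND returns 0 once i is past the last match */
position=find(var,'myword',i);
output;
end;
end;
run;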
Art, CEO, AnalystFinder.com
Arthur.T pointed in the right direction.
data hh;
infile datalines dlm="|";
length var $ 30000;
input var $;
datalines4;
<"myword myword: []/'\":+!@#$%& -0*&^% myword><W15ySpnsrCW1sA5ZZ0</W1urvA5ySpnsrCW1sA5ZZD><W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><W1:MA5oZozA5m></W1:MA5ozW1DW1ozW1Mrup><W1:MA5ozW1DW1ozW1Mrup MA5ozW1DW1ozW1MrupNW1mA5="A5CW1SA5_CS_MW1"><W1W1ozW1ZZ A5lA5mA5nozNW1mA5="WRKLW1D_OZYPA5_CDA5">myword</W1W1ozW1ZZoA5m><W1W1ozW1ZZozA5m A5lA5mA5nozNW1mA5="WRKLW1D_P_OZYPA5_CDA5"></1W1ozW1ZZozA5m></W1:MA5ozW1DW1ozW1Mrup></W1:RA5prozzznmUnzzozMA5ozW1dW1ozW1><WMW1zzlzznmW1ddrAW1ddr1>20OZ 11W1</W1:W1ddr1><W1:W1ddr3>NW1RRW1MW1NSA5OZOZ, RZZ -</W1:W1ddr3>myword;
;;;;
run;
data _null_;
set hh;
pid=prxparse('/myword/');
s=1;
e=length(var);
call prxnext(pid,s,e,var,p,l);
do i=1 by 1 while(p>0);
put i= p=;
call prxnext(pid,s,e,var,p,l);
end;
run;