In the code below, I'm trying to get yr='2018'. Instead it is blank. What am I doing wrong? Using SAS 9.4.
data out;
retain prx_yr;
if _n_=1 then do;
prx_yr=prxparse('/20\d\d/');
end;
text='dir 2018 - Subject';
p=prxmatch(prx_yr,text);
if prxmatch(prx_yr,text) then yr=prxposn(prx_yr,1,text);
output;
run;
For such a simple search just use SUBSTR().
data test;
  input text $30.;
  position=prxmatch('/20\d\d/',text);
  if position then yr=substr(text,position,4);
  put yr= text=;
cards;
dir 2018 - Subject
junk
;

Log output:
yr=2018 text=dir 2018 - Subject
yr= text=junk
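Your pattern has no capture group, so PRXPOSN(prx_yr,1,text) has no capture buffer 1 to return and yr stays blank. The moment you enclose your expression in parentheses, as in '/(20\d\d)/', things will start to work.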
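As a minimal sketch, here is the step from the question with the capture group added. The parentheses in the pattern are the actual fix. The length statement for yr and the reuse of p in place of a second PRXMATCH call are small tidy-ups I have assumed, not part of the original post.

```sas
data out;
   length yr $4;   /* assumed: declare yr so it holds a four-digit year */
   retain prx_yr;
   if _n_=1 then do;
      /* the parentheses create capture buffer 1 */
      prx_yr=prxparse('/(20\d\d)/');
   end;
   text='dir 2018 - Subject';
   /* PRXMATCH returns the position of the match, or 0 if there is none */
   p=prxmatch(prx_yr,text);
   if p then yr=prxposn(prx_yr,1,text);
   output;
run;
```

If you would rather leave the pattern as it was, asking PRXPOSN for capture buffer 0, which is the entire match, should also return the year: yr=prxposn(prx_yr,0,text).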
With more recent SAS 9 versions (I forget when this was introduced) you also no longer need to explicitly retain the variable that holds the pointer to the compiled regex, and you don't need to wrap the prxparse() call in an if _n_=1 condition. When the pattern is a constant string, as it is here, SAS compiles it only once and handles this for you.
data have;
infile datalines truncover;
input text $40.;
datalines;
dir 2018 - Subject
dir 34 2018 - Subject
dir 9202099 2018 - Subject
;
data want;
set have;
length yr $4;
prx_yr=prxparse('/\b(20\d\d)\b/');
if prxmatch(prx_yr,text) then yr=prxposn(prx_yr, 1, text);
run;
proc print data=want;
run;
In the PROC PRINT output, prx_yr is always 1, which shows that with the above syntax the regex is compiled only once. If it were compiled for each row, the number (the pointer to the compiled regex) would be different in each row.
Thanks!