Enterprise Guide 7.12
Hi
I am trying to find a number in a text field and extract it to its own column. I have a column that contains text, and anywhere in that field I may have a number like 368295194613104679. The number is always 18 characters long and currently starts with 368.
I can find all the records that have that number by using LIKE '368%'. Now I need to extract it so that it shows, for example, 368295194613104679 in a new column.
I am coding this with Proc SQL.
Any ideas would be greatly appreciated.
This is a case where PRX (Perl regular expressions) looks right:
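data have;
  text='dafgjakgkgakga368123456789012345fagagagaga';
run;
data want;
  set have;
  length number $18;
  /* position of "368" followed by 15 digits; 0 if there is no match */
  pos=prxmatch('/368\d{15}/',text);
  /* keep only rows where a match was found */
  if pos;
  number=substr(text,pos,18);
run;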
If you want to do it in SQL, I would suggest something like this:
proc sql;
  create table want as
  select text, substr(text,pos,18) as number length=18
  from (select text, prxmatch('/368\d{15}/',text) as pos from have)
  where pos>0;
quit;
The advantage of using PRX is that you can make sure that you actually have 18 digits ("368\d{15}" means "368" plus 15 digits). You say that the number currently starts with "368"; if what you really want is any 18 digits, just change the first PRXMATCH parameter to "/\d{18}/".
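To make the "exactly 18 digits" point concrete, here is a minimal sketch that compiles the pattern once with PRXPARSE and reads the match back with PRXPOSN instead of SUBSTR. The lookarounds (?<!\d) and (?!\d) are my own addition, on the assumption that a longer run of digits should not count as a match, and the extra sample rows are made up for illustration.

```sas
data have;
  length text $60;
  text='dafgjakgkgakga368123456789012345fagagagaga'; output; /* 18 digits: match */
  text='order 3681234 shipped'; output;                       /* too short: no match */
  text='id 3681234567890123456789 end'; output;               /* 22 digits: no match */
run;

data want;
  set have;
  length number $18;
  retain rx;
  /* compile the pattern once; capture group 1 holds the 18-digit number */
  if _n_=1 then rx=prxparse('/(?<!\d)(368\d{15})(?!\d)/');
  /* keep only rows where the pattern matches */
  if prxmatch(rx,text);
  number=prxposn(rx,1,text);
run;
```

If a number embedded in a longer digit string should still be picked up, dropping the two lookarounds gives the same behavior as the original pattern.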
You could use the ANYDIGIT() function, which returns the position of the first digit in the string. Note that this only works if the 18-digit number is the first run of digits in the text.
Then, you can extract the string via:
substr(string,anydigit(string),18) as numbers
Thanks, I'll try that solution. I was able to get this to work:
SUBSTR(t2.NOTES,FIND(t2.NOTES,"368"),18)
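One thing to watch with the FIND approach: FIND returns 0 when "368" is not present, and SUBSTR with a start position of 0 is an invalid argument that leaves the result missing and writes a note to the log. Below is a sketch that guards against that case. The table name have and the sample rows are hypothetical; only the alias t2 and the column NOTES come from the expression above.

```sas
data have;
  length NOTES $60;
  NOTES='call ref 368295194613104679 logged'; output; /* contains the number */
  NOTES='no reference here'; output;                   /* no "368" at all */
run;

proc sql;
  create table want as
  select t2.NOTES,
         /* only call SUBSTR when FIND actually located "368" */
         case when find(t2.NOTES,'368')>0
              then substr(t2.NOTES,find(t2.NOTES,'368'),18)
              else ' '
         end as number length=18
  from have as t2;
quit;
```

Unlike the PRX pattern, FIND does not check that the 18 characters are all digits, so this is only safe when "368" cannot appear elsewhere in the text.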
There are several ways to extract the number, as shown in the following code:
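data want;
  set have;
  len = 18;
  /* string is the text variable to search */
  /*1*/ var1 = substr(string, index(string,'368'), len);
  /*2*/ var2 = compress(string,,'kd'); /* keeps digits only; works only if the text holds no other digits */
  /*3*/ var3 = substr(string, find(string,'368'), len);
  put var1= var2= var3=;
run;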