I am new to using regular expressions in SAS and am trying to parse a phone number out of a string of text. I read through this document and also this one. This is what I have so far, and I'm not getting the result I'm looking for:
DATA in_string;
INPUT @1 STRING $50.;
DATALINES;
Here is a phone number 111-123-4567
other type of number:TT44679 111-234-6583
number is (231)-390-5710
this has two: 123-523-4545 222-333-4444
;
data out_string;
  set in_string;
  if _n_ = 1 then do;
    phone_prx = prxparse("/ \(?(\d\d\d).(\d\d\d).(\d\d\d\d)/");
  end;
  retain phone_prx;
  pos_phone = prxmatch(phone_prx, string);
  call prxposn(phone_prx, 1, areacode_pos);
  call prxposn(phone_prx, 2, phone1_pos);
  call prxposn(phone_prx, 3, phone2_pos);
  length phonenumber $15;
  phonenumber = substrn(string, areacode_pos, 3) || "-" ||
                substrn(string, phone1_pos, 3) || "-" ||
                substrn(string, phone2_pos, 4);
run;
The output I would like is:
obs phonenumber
1 111-123-4567
2 111-234-6583
3 (231)-390-5710
4 123-523-4545 222-333-4444
Note: I tried editing the post to make the code clearer, but some of the spacing may still not display correctly.