- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have data sets that have mixed formats of phone numbers such as:
+1 (915) 111-1111
+ (202) 509-3114
+1 (915) 1234567
+ (202) 5093114
(777) 123-4567
3334445678
123.222.1334
How can I standardize the numbers into the following format:
717-333-4545
Thank you
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another task for one of my favourite functions: prxchange:
data have;
input phoneNumber $20.;
datalines;
+1 (915) 111-1111
+ (202) 509-3114
+1 (915) 1234567
+ (202) 5093114
(777) 123-4567
3334445678
123.222.1334
;
data want;
set have;
phoneNumber = prxchange('s/.*(\d{3})\D*(\d{3})\D*(\d{4})/$1-$2-$3/', -1, phoneNumber);
run;
.* matches anything
(\d{3}) = three digits, the brackets create a capture-group
\D* = zero or more non-digits
$1 = the first capture-group
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data test;
col1 = "+1 (915) 111-1111";
output;
col1 = "+ (202) 509-3114";
output;
col1 = "+1 (915) 1234567";
output;
col1 = "+ (202) 5093114";
output;
col1 = "(777) 123-4567";
output;
col1 = "3334445678";
output;
col1 = "123.222.1334";
output;
run;
data want;
set test;
col2 = tranwrd (col1,"+1","");
col3 = compress(col2, '(+. -)', 'a');
col4 = catx("-",substr(col3,1,3), substr(col3,4,3), substr(col3,7,4));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another task for one of my favourite functions: prxchange:
data have;
input phoneNumber $20.;
datalines;
+1 (915) 111-1111
+ (202) 509-3114
+1 (915) 1234567
+ (202) 5093114
(777) 123-4567
3334445678
123.222.1334
;
data want;
set have;
phoneNumber = prxchange('s/.*(\d{3})\D*(\d{3})\D*(\d{4})/$1-$2-$3/', -1, phoneNumber);
run;
.* matches anything
(\d{3}) = three digits, the brackets create a capture-group
\D* = zero or more non-digits
$1 = the first capture-group
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This actually worked thank you so much
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do you have any international numbers? These may show up as 2 digits in front of other digits and not in a typical 3 digit area code , 3 digit exchange and 4 digit block.
Learned this the hard way with code similar to @andreas_lds used on some international numbers in the middle of data that was mostly US that moved them from Central America to Colorado because the local phone number was 2 digits shorter than than the US and the international code provided just enough to replace them and make something that looked like a US phone number.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Make sure you deal correctly with
+43699555123445
Simply discarding the global country code will get you in trouble.