BookmarkSubscribeRSS Feed
Not applicable
Hi All,

I have a table containing telephone number fields which is set as character value.

We expect to get telephone numbers in a set format e.g. As a valid UK landline or mobile number.

Instead we seem to also get strings like HO, HOME, WORK, EX DIR etc.

Now I want to scan through all my telephone values to find how many of them have HO, HOME, WORK or EX DIR?

Can anyone help please.

Thanks a lot in advance,
Calcite | Level 5
You can use the find function to find the substring. Check Language Reference Dictionary for the syntax for this. And you can output the results to 2 datasets if you want to view the records. E.g. something like

data strange_numbers (keep = phone_number) ok_numbers(keep = phone_number);
set mydataset;
if (find(phone_number, "HOM") > 0) or (find(teamid, "WORK") then output strange_numbers;
output ok_numbers;

Or else you could use the retain statement to just get a running count of records that match and just output that as a total number.

If you want to get a bit more sophisticated you can use the prxparse functions to find a pattern in the data. This is more flexible. You could search for any letters for example or many different types of patterns.
If all you want to do is find out is about the whole universe of text that people have added to the phone field, I'd be very tempted to compress out all the digits and punctuation from the phone number and then just run a proc freq on the remaining text. If what you want to do is REMOVE all the ancillary text, then you could compress that out as well, creating a new variable. Both techniques are shown below.

** first, make some test data;
data fone_info;
length name $10 phone $25;
infile datalines dsd;
input name $ phone $;
alan, "123 456 7890 home"
barbara, "555-634-5789 h"
clark, "06-53931302 work"
dave, "236-6132 chicago"
ed, "101-853-5937 Land"
frank, "mobile 11-2345789"
gail, "05-2583536 ext 12"
harry, "98-9783412 x 44"
ida , "97-3335643"
jenny, "101-867-5329 cell"
kathy, "96-2349224"
lawrence, "123-555-2323"
mark, "545-893-5436"
ned, "66-1234567"
oscar, "444-333-4343"
paula, "654 222 5436 home"
quentin, "777-634-8754 h"
ross, "77-5468902 work"
stephen, "43-56498321 ext dir 33"
terri, "342-6132 chicago"
ula, "555-865-2647 Land"
victor, "mobile 22-2387389"
winifred, "93-43921301 work"
xavier, "528-6132 atlanta"
yvonne, "984-333-5935 Land"
zach, "mobile 22-2312126"

** next, compress out the digits, making the text_w_phone variable;
** and compress out the alpha characters, making the phone_only variable;
data count_text;
set fone_info;
phone = upcase(phone);
text_w_phone = left(compress(phone,,'dp'));
phone_only = left(compress(phone,,'A'));

title; footnote;
ods listing;
options nodate nonumber nocenter;

** Proc Freq step to see the whole universe of text strings with phone numbers;
proc freq data=count_text;
title 'get rid of digits and see what is left';
tables text_w_phone/missing;

** proc print to see result of compress function;
proc print data=count_text;
var name phone_only text_w_phone phone;
Not applicable
Thanks a lot IrishGuy and Cynthia@SAS, I will try both the method and see which one gives me the best results.



Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.


Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 3 in conversation