BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
BrahmanandaRao
Lapis Lazuli | Level 10
data phone_number;
input Phone_Number $ 15.;
standard_phone = catx('-', SUBSTR(PUT(phone_number, Z10.), 1, 3), SUBSTR(PUT(phone_number, Z10.), 4, 2), SUBSTR(PUT(phone_number, Z10.), 6, 4));
datalines;
123-545-5421
1236439775
7066950392
123-543-2345
8766783469
304-762-2467
 
8766783469
Na
123-545-5421
 
7066950392
123-543-2345
8766783469
304-762-2467
123-545-5421
1236439775
7066950392
Na
8766783469
8766783469
;
run;

Here i want to clean the data and standard format for phone numbers in the above data

i want miniplates all  phone numbers below format  

123-543-2345

and also count of each format like below 

123-543-2345 totalcount=7

8766783469 totalcount=11 

Na total count=2

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data phone_number;
input Phone_Number $ 15.;
datalines;
123-545-5421
1236439775
7066950392
123-543-2345
8766783469
304-762-2467
 
8766783469
Na
123-545-5421
 
7066950392
123-543-2345
8766783469
304-762-2467
123-545-5421
1236439775
7066950392
Na
8766783469
8766783469
;

data want;
    /* Create variable named TYPE for counting purposes */
    length type $ 8;
    set phone_number;
    is_dash=find(phone_number,'-');
    if is_dash=0 and upcase(phone_number)^='NA' and not missing(phone_number) then do;
        phone_number=cats(substr(phone_number,1,3),'-',substr(phone_number,4,3),'-',substr(phone_number,7));
        type='No Dash';
    end;
    else if upcase(phone_number)='NA' then type='NA';
    else if not missing(phone_number) then type='Dash';
run;

proc freq data=want;
    tables type;
run;
--
Paige Miller

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26
data phone_number;
input Phone_Number $ 15.;
datalines;
123-545-5421
1236439775
7066950392
123-543-2345
8766783469
304-762-2467
 
8766783469
Na
123-545-5421
 
7066950392
123-543-2345
8766783469
304-762-2467
123-545-5421
1236439775
7066950392
Na
8766783469
8766783469
;

data want;
    /* Create variable named TYPE for counting purposes */
    length type $ 8;
    set phone_number;
    is_dash=find(phone_number,'-');
    if is_dash=0 and upcase(phone_number)^='NA' and not missing(phone_number) then do;
        phone_number=cats(substr(phone_number,1,3),'-',substr(phone_number,4,3),'-',substr(phone_number,7));
        type='No Dash';
    end;
    else if upcase(phone_number)='NA' then type='NA';
    else if not missing(phone_number) then type='Dash';
run;

proc freq data=want;
    tables type;
run;
--
Paige Miller
BrahmanandaRao
Lapis Lazuli | Level 10

hi Miller

Thank You for solution

why you used upcase function for phone_number ?

PaigeMiller
Diamond | Level 26

@BrahmanandaRao wrote:

hi Miller

Thank You for solution

why you used upcase function for phone_number ?


perhaps someone types it as NA one time and Na another time and na another time

--
Paige Miller
BrahmanandaRao
Lapis Lazuli | Level 10

Brilliant 🤗

ballardw
Super User

@BrahmanandaRao wrote:

hi Miller

Thank You for solution

why you used upcase function for phone_number ?


BR549

 

Maybe not as common as it used to be but there are letters associated with certain digits of phone numbers. B is 2 (along with A and C) , R is 7 (along with P Q and S). Some organizations pay big money to have a number that "spells" something useful to them.

So you may want to consider that as part of whatever purpose this has.

 

Remember, if you aren't doing arithmetic with it then it is not a number just a collection of digits.

HB
Barite | Level 11 HB
Barite | Level 11

Nobody knows Junior Samples' phone number.  

Ksharp
Super User
data phone_number;
input Phone_Number $ 15.;
datalines;
123-545-5421
1236439775
7066950392
123-543-2345
8766783469
304-762-2467
 
8766783469
Na
123-545-5421
 
7066950392
123-543-2345
8766783469
304-762-2467
123-545-5421
1236439775
7066950392
Na
8766783469
8766783469
;

proc format;
picture fmt
low-high='000-000-0000'
.=' ';
run;

data want;
 set phone_number;
 new_phone_number=coalescec(put(input(compress(phone_number,,'kd'),best32.),fmt.),Phone_Number);
run;

proc freq data=want ;
    tables new_phone_number/missing;
run;

Ksharp_0-1684929529570.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1061 views
  • 3 likes
  • 5 in conversation