BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Selli5
Fluorite | Level 6

Hello,

 

I need to extract the phone numbers out of a text field to establish a new cleaned contact mobile number variable and contact landline variable.  There is no set order the phone numbers are written in the text field. Attached is a sample file.  Can anyone help?

 

thanks

Sally

1 ACCEPTED SOLUTION

Accepted Solutions
Madansas7b
Obsidian | Level 7

as of my understand to keep only numbers there are several ways .and you can use like bellow

 

num = compress(long_string, , 'kd');

View solution in original post

4 REPLIES 4
Madansas7b
Obsidian | Level 7

as of my understand to keep only numbers there are several ways .and you can use like bellow

 

num = compress(long_string, , 'kd');

Patrick
Opal | Level 21

The way I would approach this task:

 

1. Data exploration and profiling

Scan through your data and collect as many different patterns of phone numbers as you can, i.e. string of 10 digits, first 2 digits 04 

 

2. Define the business rules

Define the extraction rules for your phone numbers, i.e. string with 10 digits, first 2 digits 04 -> mobile number

 

3. Define sequence how to apply business rules

Example: if substring of digits points to landline number but there is also the word "Mobile" in the string: What comes first? Will this be qualified as a landline or a mobile number? Eventually plan also for field which contains data quality score which in such a case wouldn't be 1

 

4. Implement

Now that you've got the business rules and sequence of rules you can implement via a data step and a set of IF.. THEN.. ELSE statements.

 

5. Test

A: Run your code against the sample data containing all the cases. Verify if result is as expected/defined in the business rules.

B: Run your code agains the full data set. Check log for any signs of issues, check that for all source string a phone number could get extracted (or that there is a clear explanation why this didn't happen).

 

 

Once you get stuck in step 4. on how to technically implement a business rule then come back to this forum, provide a data step which creates sample data with the source string, provide the business rule, the not yet working code you've already developed and show us how the expected result should look like when applied on the sample data you've provided.

 

lakshmi_74
Quartz | Level 8
From your dataset i am seeing the same number with the text of specific category('land','mobile) number in the column of current_phone. Can you let us know what extactly you are looking for the output?
Selli5
Fluorite | Level 6

Hi,

I have a current phone variable which has mixed text with the phone numbers.  I would like to create a new phone variable that just had the phone numbers.  

Secondly I would like all mobile numbers beginning with (4 or 04) to All begin with 04.  
And numbers beginning with 08 or 8 or 9 with a state = WA to All begin 08. and so on for all states.

CURRENT PHONE    STATE
0418448841MOBILE    WA
FAX90227506    WA
CAR0L'S MOBILE 0409211082    WA
890213600    WA
90214178    WA
427866574    WA
(HOME90930658)    NSW
418934400    WA
0439394323/DESMOND    WA

thanks
Sally

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1198 views
  • 1 like
  • 4 in conversation