BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Shivi82
Quartz | Level 8

Dear Team,

 

I have working on a data which has 1 million observations.

This data contains the phone numbers of our customers.

During the booking process the resources enter the phone numbers of our clients however there are many instances where an incorrect phone number has been entered lets says "9999999999". <this is an india format with 10 digits and no country code>.

 

I have run proc freq on this data and can see there is approx 19.5% of the data which is incorrect i.e. in the format like "9999999999", "1111111111" etc which are written in varuous ways.

I need to delete this data and only keep the valid phone numbers.

 

running if delete statement would have done the trick however as i mentioned the ph number is written in various ways (all numeric).

 

Kindly suggest an easy way if possible. I am working on SAS developer.

 

Regards, Shivi

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ok, the easy way:

Write down what a valid phone number consists of, for example:

- Comprises of 9 numbers

- All numeric

- Prefixed by 001

 

Write your datastep logic to match this:

data want;
  set have;
  select;
    case (length(phone)>9) delete;
    case (compress(phone," ","d")) delete;
    case (substr(phone,1,3) ne "001") delete;
    otherwise;
  end;
run;

View solution in original post

3 REPLIES 3
FreelanceReinh
Jade | Level 19

Hi Shivi,

 

There is a parallel thread about almost the same topic: https://communities.sas.com/t5/Base-SAS-Programming/Processing-fields-based-on-value-s/m-p/237105#M4.... Part of the recommendations given there may be applicable to your case.

Shivi82
Quartz | Level 8
Thanks for sharing the link but this has made me go bonkers. The article shared in the link was enormous and very difficult to understand as not really an expert on sas macros.
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ok, the easy way:

Write down what a valid phone number consists of, for example:

- Comprises of 9 numbers

- All numeric

- Prefixed by 001

 

Write your datastep logic to match this:

data want;
  set have;
  select;
    case (length(phone)>9) delete;
    case (compress(phone," ","d")) delete;
    case (substr(phone,1,3) ne "001") delete;
    otherwise;
  end;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1062 views
  • 1 like
  • 3 in conversation