Dear Team,
I have working on a data which has 1 million observations.
This data contains the phone numbers of our customers.
During the booking process the resources enter the phone numbers of our clients however there are many instances where an incorrect phone number has been entered lets says "9999999999". <this is an india format with 10 digits and no country code>.
I have run proc freq on this data and can see there is approx 19.5% of the data which is incorrect i.e. in the format like "9999999999", "1111111111" etc which are written in varuous ways.
I need to delete this data and only keep the valid phone numbers.
running if delete statement would have done the trick however as i mentioned the ph number is written in various ways (all numeric).
Kindly suggest an easy way if possible. I am working on SAS developer.
Regards, Shivi
Ok, the easy way:
Write down what a valid phone number consists of, for example:
- Comprises of 9 numbers
- All numeric
- Prefixed by 001
Write your datastep logic to match this:
data want; set have; select; case (length(phone)>9) delete; case (compress(phone," ","d")) delete; case (substr(phone,1,3) ne "001") delete; otherwise; end; run;
Hi Shivi,
There is a parallel thread about almost the same topic: https://communities.sas.com/t5/Base-SAS-Programming/Processing-fields-based-on-value-s/m-p/237105#M4.... Part of the recommendations given there may be applicable to your case.
Ok, the easy way:
Write down what a valid phone number consists of, for example:
- Comprises of 9 numbers
- All numeric
- Prefixed by 001
Write your datastep logic to match this:
data want; set have; select; case (length(phone)>9) delete; case (compress(phone," ","d")) delete; case (substr(phone,1,3) ne "001") delete; otherwise; end; run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.