email addresses error

Grandhi4 · Posted 05-10-2012 10:21 AM

Hi,

I have about two million records (email IDs), and some of them are invalid email addresses ('@' missing, '.com' missing, '.net' missing, ...). The addresses also vary in length. My questions are:

1) How do I identify the 'error' email IDs?

2) How do I delete the 'error' email IDs?

3) How do I make two different data sets, one for the 'error' ones and one for the 'non-error' ones?

Can anyone please help with the logic (code)?

Thanks,

Suresh

RickM · Posted 05-10-2012 11:31 AM

For finding valid addresses, I think the Perl regular expression functions (prxparse, prxmatch) would be a good way to go.

You can output data to different datasets within the same data step.

data A B;
  set C;
  if condition then output A;
  else output B;
run;

Good luck!

FriedEgg · Posted 05-10-2012 03:08 PM

data eml;
input eml $20.;
cards;
[email protected]
[email protected]
b [email protected]
notanemail.com
;
run;

data good bad;
  set eml;
  /* strip() removes leading and trailing blanks so the $ anchor can match; ^ negates the result */
  if ^prxmatch('/^\w[\w\.\-]*\w\@\w[\w\.\-]*\w(\.\w{2,4})$/',strip(eml)) then output bad;
  else output good;
run;
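
A possible variant of the step above, offered as a sketch rather than a definitive implementation: it compiles the same pattern once with prxparse and keeps the compiled id in a retained variable, as suggested earlier in the thread, and then uses a subsetting IF to build a data set with the invalid addresses removed (question 2). The names re and eml_clean are hypothetical. The pattern is only a rough filter; it will reject some valid addresses, such as those with a '+' before the '@' or a top-level domain longer than four characters.

```sas
/* Split into good/bad using a pattern compiled once with prxparse */
data good bad;
  set eml;
  retain re;                    /* hypothetical variable: compiled pattern id */
  if _n_ = 1 then
    re = prxparse('/^\w[\w\.\-]*\w\@\w[\w\.\-]*\w(\.\w{2,4})$/');
  if prxmatch(re, strip(eml)) then output good;
  else output bad;
  drop re;                      /* do not write the pattern id to the outputs */
run;

/* "Delete" the bad addresses: a subsetting IF keeps only matching rows */
data eml_clean;                 /* hypothetical output data set name */
  set eml;
  if prxmatch('/^\w[\w\.\-]*\w\@\w[\w\.\-]*\w(\.\w{2,4})$/', strip(eml));
run;
```

Writing the cleaned rows to a new data set instead of overwriting eml keeps the original two million records available in case the pattern turns out to be too strict.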
