<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Finding bad email address with prxparse how do you stop multiple @ in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16127#M2907</link>
    <description>Be careful with the space thing.&lt;BR /&gt;
&lt;BR /&gt;
I would use &lt;BR /&gt;
[pre]&lt;BR /&gt;
extraspace = anyspace(trim(left(email)));&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
You'll be surprised at how fast SAS can blow through a multi-million observation data set, even on a PC these days.&lt;BR /&gt;
&lt;BR /&gt;
And, you are welcome.</description>
    <pubDate>Wed, 07 May 2008 18:53:16 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2008-05-07T18:53:16Z</dc:date>
    <item>
      <title>Finding bad email address with prxparse how do you stop multiple @</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16124#M2904</link>
      <description>Hi,&lt;BR /&gt;
I am trying to find bad email addresses in a file, and I am using the following code:&lt;BR /&gt;
data lines&lt;BR /&gt;
adsf@mekto.com adre@mekto.com&lt;BR /&gt;
adsf@mekto.com adr&lt;BR /&gt;
erese@mekel@sdfsd.com&lt;BR /&gt;
my@my.com&lt;BR /&gt;
23432@295.com&lt;BR /&gt;
my~@dsf.ca&lt;BR /&gt;
myse@sere&lt;BR /&gt;
@com.ec&lt;BR /&gt;
fe!@d.com&lt;BR /&gt;
adfe'j@cfe.ca&lt;BR /&gt;
adfsd@nl.jjle.ca&lt;BR /&gt;
&lt;BR /&gt;
data work.testlht3;&lt;BR /&gt;
set work.mytest;&lt;BR /&gt;
if _n_=1 then do;&lt;BR /&gt;
re= prxparse("/((\w|\.|\-)+@(\w|\.|\-))+/");&lt;BR /&gt;
end;&lt;BR /&gt;
retain re;&lt;BR /&gt;
if ^prxmatch(re,email) then LHT=0;&lt;BR /&gt;
  else LHT=1; &lt;BR /&gt;
  run;&lt;BR /&gt;
&lt;BR /&gt;
This only finds problems with my~@dsf.ca, @com.ec and fe!@d.com.&lt;BR /&gt;
My problem is that it does not pick up email addresses with multiple @, or addresses that contain spaces.&lt;BR /&gt;
These are the test addresses that I want flagged as wrong, but I can't seem to get the Perl expression right:&lt;BR /&gt;
adsf@mekto.com adre@mekto.com (has basically 2 addresses in the field)&lt;BR /&gt;
adsf@mekto.com adr  (has space and then text)&lt;BR /&gt;
erese@mekel@sdfsd.com (multiple @)&lt;BR /&gt;
myse@sere (no .com)&lt;BR /&gt;
adfe'j@cfe.ca  (there is a ' in this address)&lt;BR /&gt;
&lt;BR /&gt;
I have also tried the following statement but it doesn't work the way I need either. &lt;BR /&gt;
We do have valid addresses like mh@nxe.ener.ds.com not just mm@mse.com&lt;BR /&gt;
&lt;BR /&gt;
prxparse('/ \w[-.\w]*\@[-\w]+(\.[-\w]+)*\.(ca|com|edu|gov|int|mil|net|org|biz|info|name|museum|coop|aero|[a-z][a-z]) /i'); &lt;BR /&gt;
&lt;BR /&gt;
Any ideas??&lt;BR /&gt;
Thanks</description>
      <pubDate>Wed, 07 May 2008 16:05:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16124#M2904</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-05-07T16:05:27Z</dc:date>
    </item>
    <item>
      <title>Re: Finding bad email address with prxparse how do you stop multiple @</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16125#M2905</link>
      <description>The way to eat an elephant is one bite at a time.&lt;BR /&gt;
Don't run yourself into the ground trying to make one thing do everything.&lt;BR /&gt;
Test the pattern.&lt;BR /&gt;
Test for multiple '@' separately.&lt;BR /&gt;
Test for embedded spaces separately.&lt;BR /&gt;
&lt;BR /&gt;
As an example,&lt;BR /&gt;
When I test a textually provided date, I have a sequence of steps&lt;BR /&gt;
1) is it in a prescribed format, e.g. yyyy-mm-dd = '....-..-..'&lt;BR /&gt;
In this case&lt;BR /&gt;
2) are the fields numeric?&lt;BR /&gt;
3) is 01 LE mm LE 12&lt;BR /&gt;
4) is 01 LE dd LE 31&lt;BR /&gt;
5) for a given month, is dd within that month's proper range -- jan LT 32, apr LT 31; I have already determined it is GT 0.&lt;BR /&gt;
&lt;BR /&gt;
This simplifies the parsing, and improves my error responses to being more specific to what is wrong, as opposed to just "invalid date".&lt;BR /&gt;
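&lt;BR /&gt;
As a rough sketch only, that sequence might look like this in a DATA step, with one test per step and a message for each. The data set work.dates and the variables datestr, err, mm and dd are made up for illustration; datestr is assumed to be character with a length of at least 10, and the day table allows Feb 29 in every year to keep it short.&lt;BR /&gt;
[pre]&lt;BR /&gt;
data work.datecheck;&lt;BR /&gt;
  set work.dates;   /* hypothetical input with a character variable datestr */&lt;BR /&gt;
  length err $32;&lt;BR /&gt;
  /* longest possible day number per month, Feb 29 always allowed */&lt;BR /&gt;
  array maxday[12] _temporary_ (31 29 31 30 31 30 31 31 30 31 30 31);&lt;BR /&gt;
  err = ' ';&lt;BR /&gt;
  /* 1) layout yyyy-mm-dd */&lt;BR /&gt;
  if length(datestr) ne 10 or substr(datestr,5,1) ne '-' or substr(datestr,8,1) ne '-'&lt;BR /&gt;
    then err = 'not in yyyy-mm-dd layout';&lt;BR /&gt;
  /* 2) all three fields numeric */&lt;BR /&gt;
  else if notdigit(substr(datestr,1,4)) or notdigit(substr(datestr,6,2)) or notdigit(substr(datestr,9,2))&lt;BR /&gt;
    then err = 'non-numeric field';&lt;BR /&gt;
  else do;&lt;BR /&gt;
    mm = input(substr(datestr,6,2), 2.);&lt;BR /&gt;
    dd = input(substr(datestr,9,2), 2.);&lt;BR /&gt;
    /* 3) and 4) range checks */&lt;BR /&gt;
    if mm lt 1 or mm gt 12 then err = 'month not 01-12';&lt;BR /&gt;
    else if dd lt 1 or dd gt 31 then err = 'day not 01-31';&lt;BR /&gt;
    /* 5) day against the length of that month */&lt;BR /&gt;
    else if dd gt maxday[mm] then err = 'day too large for month';&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
A blank err means the value passed every step; anything else names the first step that failed.&lt;BR /&gt;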
&lt;BR /&gt;
To count the number of '@' that exist in a string, use either the SAS count or countc functions.&lt;BR /&gt;
&lt;BR /&gt;
You can use INDEX, INDEXC, COUNT, COUNTC or ANYSPACE to identify spaces.  ANYSPACE identifies white space -- tab, space, carriage return.
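&lt;BR /&gt;
&lt;BR /&gt;
For instance, a minimal sketch of the two separate tests, run on string literals borrowed from the question so that trailing-blank padding of a data set variable does not enter into it. The variable names n_at, pos1 and pos2 are made up here.&lt;BR /&gt;
[pre]&lt;BR /&gt;
data _null_;&lt;BR /&gt;
  /* COUNTC counts the characters in the first argument that appear in the second */&lt;BR /&gt;
  n_at = countc('erese@mekel@sdfsd.com', '@');   /* 2 */&lt;BR /&gt;
  /* ANYSPACE returns the position of the first white-space character, or 0 if there is none */&lt;BR /&gt;
  pos1 = anyspace('adsf@mekto.com adr');         /* 15, the embedded blank */&lt;BR /&gt;
  pos2 = anyspace('my@my.com');                  /* 0 */&lt;BR /&gt;
  put n_at= pos1= pos2=;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
An address would then be suspect when the '@' count is anything other than 1 or when the position is greater than 0.&lt;BR /&gt;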

Message was edited by: Chuck</description>
      <pubDate>Wed, 07 May 2008 16:23:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16125#M2905</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-05-07T16:23:09Z</dc:date>
    </item>
    <item>
      <title>Re: Finding bad email address with prxparse how do you stop multiple @</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16126#M2906</link>
      <description>Thanks Chuck. I was just trying to be as efficent as possible as I will be doing this for millions of addresses. I took your advice and now have it doing what I need.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Can anyone make it more efficient than this?&lt;BR /&gt;
&lt;BR /&gt;
data work.testlht3;&lt;BR /&gt;
set work.mytest;&lt;BR /&gt;
if _n_=1 then do;&lt;BR /&gt;
re= prxparse("/((\w|\.|\-)+@(\w|\.|\-)+\.(\w))+/");&lt;BR /&gt;
end;&lt;BR /&gt;
retain re;&lt;BR /&gt;
if ^prxmatch(re,email) then LHT=0;&lt;BR /&gt;
  else LHT=1; &lt;BR /&gt;
multiple_at=countc(email,'@');&lt;BR /&gt;
badspace=ANYSPACE(email);&lt;BR /&gt;
elength=length(email);&lt;BR /&gt;
extraspace=(elength-badspace);&lt;BR /&gt;
quotescan=index(email,"'");&lt;BR /&gt;
if (quotescan&amp;gt;0 or extraspace&amp;gt;0 or lht=0 or multiple_at ne 1) then bademail=1;&lt;BR /&gt;
else bademail=0;&lt;BR /&gt;
 run;</description>
      <pubDate>Wed, 07 May 2008 17:07:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16126#M2906</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-05-07T17:07:48Z</dc:date>
    </item>
    <item>
      <title>Re: Finding bad email address with prxparse how do you stop multiple @</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16127#M2907</link>
      <description>Be careful with the space thing.&lt;BR /&gt;
&lt;BR /&gt;
I would use &lt;BR /&gt;
[pre]&lt;BR /&gt;
extraspace = anyspace(trim(left(email)));&lt;BR /&gt;
[/pre]&lt;BR /&gt;
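&lt;BR /&gt;
To show why the trim matters, here is a small sketch with one made-up value in a $20 variable (raw and trimmed are illustrative names only). A character variable is padded with blanks to its full length, and ANYSPACE sees that padding.&lt;BR /&gt;
[pre]&lt;BR /&gt;
data _null_;&lt;BR /&gt;
  length email $20;&lt;BR /&gt;
  email = 'my@my.com';                     /* 9 characters plus 11 padding blanks */&lt;BR /&gt;
  raw = anyspace(email);                   /* 10, the first padding blank */&lt;BR /&gt;
  trimmed = anyspace(trim(left(email)));   /* 0, no white space inside the value */&lt;BR /&gt;
  put raw= trimmed=;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
One edge case to keep in mind: for a completely blank value TRIM returns a single blank, so the trimmed test gives 1 there rather than 0.&lt;BR /&gt;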
&lt;BR /&gt;
You'll be surprised at how fast SAS can blow through a multi-million observation data set, even on a PC these days.&lt;BR /&gt;
&lt;BR /&gt;
And, you are welcome.</description>
      <pubDate>Wed, 07 May 2008 18:53:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Finding-bad-email-address-with-prxparse-how-do-you-stop-multiple/m-p/16127#M2907</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-05-07T18:53:16Z</dc:date>
    </item>
  </channel>
</rss>

