BookmarkSubscribeRSS Feed
AB1976
Calcite | Level 5

Good day I would like assistance in transforming this SQL code into a code suitable for SAS.

 

I am trying to identify email addresses that do not conform to the validation rules set out for a valid email address.

 
CASE 
WHEN [EMAIL_ADRES_X] LIKE '%_@__%.__%' AND PATINDEX('\b[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}\b', [EMAIL_ADRES_X]) = 0 THEN 1
ELSE 0
END AS email_conformity
FROM #T2 

 

4 REPLIES 4
quickbluefish
Barite | Level 11

I don't remember the syntax for the PRXMATCH function offhand, but you'd do something like this - just multiplying these two binary checks (>0 and =0) together to create a 0/1 variable called email_conformity.  Note that "email_address" is assumed to be a variable (column) in your input data and input data is called 'emailaddr':

 

data emailaddr;
set emailaddr;
email_conformity=(
    prxmatch(YOUR EMAIL REGEX, email_address)>0
    *
    prxmatch(YOUR OTHER REGEX, email_address)=0
    );
run;
PaigeMiller
Diamond | Level 26

It would be better if you explained the rules you want to follow, rather than providing PRX code, which I am unfamiliar with.

--
Paige Miller
Tom
Super User Tom
Super User

Are you asking how to translate the PATINDEX() function that is included as an SQL enhancement in your current SQL dialect into a SAS function? 

 

You might try PRXMATCH function.

 

Or are you asking what to do about those strange square brackets and hash mark?  For those I think you want to replace them with the actual names of your variables and dataset.

Patrick
Opal | Level 21

@AB1976 Something like below should work.

proc sql;
  select 
  CASE 
  WHEN 
      EMAIL_ADRES_X LIKE '%_@__%.__%' 
      AND prxmatch('/\b[a-zA-Z0-9\._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}\b/i', strip(EMAIL_ADRES_X)) = 1 
      THEN 1
      ELSE 0
      END AS email_conformity
  FROM have
  ;
quit;

Like SQL Server patindex() SAS prxmatch() will return the starting position of a match or zero if it isn't found. 

From the looks of it your RegEx got some "issues" - especially the full stop that's not masked and though stands for any character.

 

Instead of inventing a email validation RegEx yourself I'd Google for it. Below one way how you could modify your current syntax.

data have;
  EMAIL_ADRES_X='firstname.lastname@somecompany.com'; output;
  EMAIL_ADRES_X='..@somecompany.com'; output;

run;

proc sql;
  select 
  CASE 
  WHEN 
    prxmatch('/^[a-z0-9][\w\.-]+@[\w\.-]+\.[a-z]{2,6}$/oi', strip(EMAIL_ADRES_X)) = 1 THEN 1
    ELSE 0
    END AS email_conformity
  FROM have
  ;
quit;

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 889 views
  • 2 likes
  • 5 in conversation