Is there a more condensed way to write this? I want to know if I can do the "keep" group all in one line instead of needing to duplicate it for each variation of email domain.
CASE
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.COM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MIL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.BIZ' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.NET' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ORG' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.EDU' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.GOV' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.PRO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.INFO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.US' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.CA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.AERO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ARPA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ASIA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.CAT' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.COOP' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.INT' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.JOBS' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MOBI' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MUSEUM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.NAME' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.TEL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.TRAVEL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.UM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'POPULATED_COUNTRY' THEN 'KEEP'
ELSE 'REMOVE'
END
case when t2.nvl_country_group = 'NULL' and upcase(scan(t1.email_address, 2, '.@')) in ('COM', 'MIL', BIZ', 'NET', .... , 'UM') then 'KEEP'
else 'REMOVE'
end;
Couple of other options:
1. Create a data set with the list of domains you want to include in the strings and use that instead of typing it into your code. It makes it a bit easier to maintain in the long run.
2. From #1 you could create a format that would assign the KEEP/REMOVE values
3. Test NULL condition only once instead of every line, nest the CASE or IF/THEN statements if necessary.
@agoldberg2018 wrote:
Is there a more condensed way to write this? I want to know if I can do the "keep" group all in one line instead of needing to duplicate it for each variation of email domain.
CASE
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.COM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MIL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.BIZ' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.NET' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ORG' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.EDU' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.GOV' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.PRO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.INFO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.US' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.CA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.AERO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ARPA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ASIA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.CAT' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.COOP' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.INT' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.JOBS' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MOBI' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MUSEUM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.NAME' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.TEL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.TRAVEL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.UM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'POPULATED_COUNTRY' THEN 'KEEP'
ELSE 'REMOVE'
END
case when t2.nvl_country_group = 'NULL' and upcase(scan(t1.email_address, 2, '.@')) in ('COM', 'MIL', BIZ', 'NET', .... , 'UM') then 'KEEP'
else 'REMOVE'
end;
Couple of other options:
1. Create a data set with the list of domains you want to include in the strings and use that instead of typing it into your code. It makes it a bit easier to maintain in the long run.
2. From #1 you could create a format that would assign the KEEP/REMOVE values
3. Test NULL condition only once instead of every line, nest the CASE or IF/THEN statements if necessary.
@agoldberg2018 wrote:
Is there a more condensed way to write this? I want to know if I can do the "keep" group all in one line instead of needing to duplicate it for each variation of email domain.
CASE
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.COM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MIL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.BIZ' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.NET' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ORG' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.EDU' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.GOV' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.PRO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.INFO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.US' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.CA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.AERO' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ARPA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.ASIA' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.CAT' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.COOP' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.INT' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.JOBS' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MOBI' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.MUSEUM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.NAME' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.TEL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.TRAVEL' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'NULL' and t1.EMAIL_ADDRESS LIKE '%.UM' THEN 'KEEP'
WHEN t1.NVL_COUNTRY_GROUP = 'POPULATED_COUNTRY' THEN 'KEEP'
ELSE 'REMOVE'
END
@Reeza can you explain to me what
SCAN(t1.EMAIL_ADDRESS, 2, '.@')
means?
It actually didn't work so I am wondering if the .@ is wrong.. is there a link you know of I can reference for a deep explanation of this?
Thanks for your help!
SCAN() is trying to get the component of the email.
Emails are usually in the form of :
xyz@domain.TLD
@scan() says that the components can be separated by the fields in the third parameter, @ and dot (.)
2 says to take the second one.
XYX - Domain - TLD
So 2 would return the domain portion.
Looking back at your question you actually want the last portion so you can change it to 3.
If you have addresses of the form:
ABC.XYZ@DOMAIN.TLD you can use -1 to get the last component instead.
scan(t1.email_address, -1, '.@')
@agoldberg2018 wrote:
@Reeza can you explain to me what
SCAN(t1.EMAIL_ADDRESS, 2, '.@')
means?
It actually didn't work so I am wondering if the .@ is wrong.. is there a link you know of I can reference for a deep explanation of this?
Thanks for your help!
Yes! The -1 one makes the most sense because of people who separate words in the email address like firstname.lastname@abc.com
Thank you again for taking the time to explain this to me! I greatly appreciate the knowledge.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.