nakulkothari
Calcite | Level 5

I have a huge list of email addresses, and I want to determine the number of dictionary words in each one. The programming language I am using is SAS.

 

Ex - suppose the email addresses are as below. The output I require is:

coolgirl@email.com --> 2 dictionary words - cool and girl
angeldream@gmail.com --> 2 dictionary words - angel and dream

 

Can anyone suggest how to go about it?

4 REPLIES
RW9
Diamond | Level 26

It's easy enough to get a list of words off the net; my first search came up with this:

https://github.com/dwyl/english-words

 

The question is how you are going to lexically parse a text string to find words. There are many possible combinations, different meanings, different spellings, etc. Just take your example, coolgirl: what if it was coolaid? Is that two separate words, or the company name? What about halfpipe: should it be half and pipe, or halfpipe?
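To make the ambiguity concrete, here is a minimal sketch of one possible rule: split the handle into the fewest dictionary words that cover it completely. It is written in Python only to show the logic; the same idea could be ported to a SAS data step with a hash object for the lookups. The tiny `WORDS` set and the names `segment` and `count_words` are assumptions made up for illustration, and "fewest words wins" is just one way to settle cases like halfpipe.

```python
from functools import lru_cache

# Tiny stand-in dictionary (assumption); in practice load a full word list.
WORDS = {"cool", "girl", "angel", "dream", "half", "pipe", "halfpipe"}


def segment(handle, words=WORDS):
    """Split handle into the fewest dictionary words that cover it entirely.

    Returns a list of words, or None if no full segmentation exists.
    """
    handle = handle.lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Best segmentation of handle[start:], as a tuple of words.
        if start == len(handle):
            return ()
        candidates = []
        for end in range(start + 1, len(handle) + 1):
            piece = handle[start:end]
            if piece in words:
                rest = best(end)
                if rest is not None:
                    candidates.append((piece,) + rest)
        # Prefer the split that uses the fewest words.
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return list(result) if result is not None else None


def count_words(email, words=WORDS):
    """Return (number of words, list of words) for the part before '@'."""
    handle = email.split("@", 1)[0]
    parts = segment(handle, words)
    return (len(parts), parts) if parts else (0, [])
```

With this rule, `count_words("coolgirl@email.com")` gives 2 words (cool, girl), while a handle of halfpipe counts as 1 word because the single-word split is shorter. Handles containing digits, dots, or leftover letters are reported as 0 here; whether to count partial matches instead is a decision the project has to make.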

 

I think your best bet would be to investigate text analytics if you really need to do this, although it requires another license:

http://www.sas.com/en_us/software/analytics/text-miner.html

PGStats
Opal | Level 21

Some word lists are available at https://sourceforge.net/projects/wordlist/files/latest/download?source=typ_redirect
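Whichever list is used, it usually pays to clean it before matching. The sketch below assumes the list is plain text with one word per line (check the file you download), and the helper name `load_words`, the `min_len` cutoff, and the file name in the usage comment are hypothetical. Dropping very short entries is a judgment call: large lists tend to include one- and two-letter entries, which can make almost any string look like it is built from "words".

```python
def load_words(lines, min_len=3):
    """Build a lookup set from an iterable of lines, one word per line."""
    words = set()
    for line in lines:
        word = line.strip().lower()
        # Keep purely alphabetic entries of at least min_len letters.
        if len(word) >= min_len and word.isalpha():
            words.add(word)
    return words


# Usage with a hypothetical file name:
# with open("words.txt", encoding="utf-8") as fh:
#     words = load_words(fh)
```

A set gives constant-time membership checks, which matters when every substring of every handle is looked up; in SAS the closest equivalent would be a hash object keyed on the word.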

 

Is this for targeted marketing? 

PG
nakulkothari
Calcite | Level 5
No, this is not for targeted marketing.

I am doing a project in which I need to determine the number of dictionary words in each email handle.

I am stuck on this question and don't know how to proceed.
