nakulkothari
Calcite | Level 5

I have a huge list of email addresses, and I want to determine the number of dictionary words in each one. The programming language I am using is SAS.

 

For example, suppose the email addresses are as below. The output I require is:

coolgirl@email.com --> 2 dictionary words: cool and girl
angeldream@gmail.com --> 2 dictionary words: angel and dream

 

Can anyone suggest how to go about it?

4 REPLIES
RW9
Diamond | Level 26

It's easy enough to get a list of words off the net; my first search came up with this:

https://github.com/dwyl/english-words

 

The question is how you are going to parse a text string to find the words in it. There are many combinations, different meanings, different spellings, etc. Just take your example, coolgirl: what if it were coolaid? Is that two separate words, or the company name? What about halfpipe: should it be half and pipe, or halfpipe?
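To make the parsing problem concrete, here is a minimal sketch in Python rather than SAS. It is only an illustration under stated assumptions: `SAMPLE_WORDS` is a tiny made-up word list standing in for a real downloaded one, the function names are hypothetical, and the rule "prefer the split with the fewest words" is just one possible way to settle the halfpipe question, not the only sensible one.

```python
def segment(handle, words):
    """Split handle into the fewest dictionary words; return None if impossible."""
    n = len(handle)
    # best[i] is the shortest list of words covering handle[:i], or None.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            piece = handle[j:i]
            if best[j] is not None and piece in words:
                candidate = best[j] + [piece]
                # Assumption: fewer words wins, so "halfpipe" beats "half" + "pipe".
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n]


def words_in_email(email, words):
    """Return the dictionary words found in the part of the address before '@'."""
    handle = email.split("@", 1)[0].lower()
    parts = segment(handle, words)
    return parts if parts is not None else []


# Hypothetical stand-in for a real word list.
SAMPLE_WORDS = {"cool", "girl", "angel", "dream", "half", "pipe", "halfpipe"}

for address in ("coolgirl@email.com", "angeldream@gmail.com", "halfpipe@example.com"):
    parts = words_in_email(address, SAMPLE_WORDS)
    print(address, len(parts), parts)
```

In this sketch, a handle that cannot be covered completely by dictionary words (because of digits, dots, or leftover letters) yields an empty list, so you would still have to decide how to treat partial matches. The same lookup logic could probably be ported to a SAS DATA step using a hash object loaded from the word list.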

 

I think your best bet would be to investigate text analytics if you really need to do this, although it requires another license:

http://www.sas.com/en_us/software/analytics/text-miner.html

PGStats
Opal | Level 21

Some word lists are available at https://sourceforge.net/projects/wordlist/files/latest/download?source=typ_redirect

 

Is this for targeted marketing? 

PG
nakulkothari
Calcite | Level 5
No, this is not for targeted marketing.

I am doing a project in which I need to determine the number of dictionary words in each email handle.

I am stuck on this question and don't know how to proceed.

