Text Mining Using Base SAS

Hi All,

I'm trying to write a program that counts how many times each unique word appears in a variable. The count needs to be case-insensitive, and I need to use just Base SAS.

CONTENT
signs that I might be experiencing Candida? A. Here is our list of 100 Common Candida Symptoms associated with systemic. Use zyrtec
i    a week of sttn.. about 3 weeks ago.. LO had bad allergies so he had to take some zyrtec but he slept sooo good
Allergy Aid Cleansing Expert services Are Necessary Some folks are incapable of having a great night's sleep

So the output I'm looking for is something like this (this is not the full list):

Word            Count
zyrtec          2
Some            2
of              3
I               2
he              2
had             2
experiencing    1
Common          1
Cleansing       1
Candida         1
a               2

Is there a way to write a do loop? It would obviously be way too difficult to use just string functions to do this.

Any assistance is greatly appreciated.

Thanks! 

Re: Text Mining Using Base SAS

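A few functions (countw, compress, scan, and lowcase) can get you pretty far. In the code below, have stands for your dataset and sentence for your text variable (CONTENT in your example).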

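data words;
    set have;
    /* Count the words in the text variable */
    num_words = countw(sentence);
    do i = 1 to num_words;
        /* Take the i-th word, keep only letters ('ka'), and lowercase it */
        word = lowcase(compress(scan(sentence, i), , 'ka'));
        /* Skip tokens that contain no letters, such as "100" */
        if word ne '' then output;
    end;
    keep word;
run;

/* Tabulate how often each word occurs */
proc freq data=words;
    table word;
run;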
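
If you want the counts in a dataset, sorted with the most frequent words first (closer to the Word/Count layout in the question), something along these lines might work. Treat it as a rough sketch rather than a tested solution: the rows in have are abbreviated from the sample in the question, and the dataset name word_counts and the $40 word length are assumptions made up for this example.

```sas
/* Hypothetical input: abbreviated rows from the question, under 80 columns */
data have;
    infile datalines truncover;
    input content $char80.;
    datalines;
Here is our list of 100 Common Candida Symptoms. Use zyrtec
LO had bad allergies so he had to take some zyrtec but he slept sooo good
Allergy Aid Cleansing Expert services Are Necessary
;
run;

/* One row per word: letters only, lowercased */
data words;
    set have;
    length word $ 40;   /* assumed maximum word length */
    do i = 1 to countw(content);
        word = lowcase(compress(scan(content, i), , 'ka'));
        if word ne '' then output;   /* drop tokens with no letters */
    end;
    keep word;
run;

/* Write the counts to a dataset instead of the listing */
proc freq data=words noprint;
    tables word / out=word_counts(drop=percent);
run;

/* Most frequent words first, ties in alphabetical order */
proc sort data=word_counts;
    by descending count word;
run;

proc print data=word_counts noobs;
run;
```

The OUT= dataset from PROC FREQ holds the word and a COUNT variable, so any further filtering (for example, dropping very common words) can be done with an ordinary data step or a WHERE clause.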
