deleted_user
Not applicable
Hi,

I've been a SAS user for the last few years, but I'm new to text mining. At the outset, I should mention that I don't have access to SAS Text Miner or SAS Enterprise Miner. I only have SAS Enterprise Guide (EG).

The challenge I'm facing is with text search and counting in a "comment" field from a market research survey. My goal is to count the number of occurrences of each word, excluding prepositions, articles, conjunctions, and the like. This would give me an idea of what people are trying to convey in the open comments. I need to do this using SAS code and not any of the text mining software from SAS.

It would help if someone could point me to the best way to achieve this. What steps should I take? Even with simple pointers, I can build the code myself.

Thanks in advance for your help.

Prakash
2 REPLIES
T_Rex
Calcite | Level 5
I'm a long-time Base SAS user and have done some work with the INDEX functions and macro processing to evaluate text data. Since EG is your only tool, I would use the "Code Node" and combine Base SAS functions and macros with DATA step programming. You'll need to find (or build) a database containing the content you are looking for, or you can use one of several other methods. I would start with the basics and build from your research. The SAS Online Docs for Base SAS would be very helpful in this regard.
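As a rough illustration of the lookup idea, here is a minimal sketch that counts how many comments mention each term from a small keyword table. The data set names (`comments`, `keywords`, `keyword_hits`), the variable names, and the sample rows are all hypothetical. It uses FINDW rather than INDEX so that only whole words match; INDEX would also match substrings.

```sas
/* Hypothetical sample comments; replace with your survey table. */
data comments;
   infile datalines truncover;
   input comment $200.;
   id = _n_;
   datalines;
The service was great and the staff was friendly
Great price, but the delivery was slow
Slow delivery and the price was too high
;
run;

/* Hypothetical lookup table of terms of interest. */
data keywords;
   length keyword $40;
   input keyword $;
   datalines;
price
delivery
staff
;
run;

/* Count the comments in which each keyword appears as a whole word. */
/* The third FINDW argument lists the characters treated as delimiters. */
proc sql;
   create table keyword_hits as
   select k.keyword, count(*) as n_comments
   from keywords as k, comments as c
   where findw(lowcase(c.comment), strip(k.keyword), ' .,;:!?()"') > 0
   group by k.keyword;
quit;
```

Keywords that never occur will not appear in `keyword_hits`, because the join keeps only matching pairs.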
RussAlbright
SAS Employee
Prakash,

This would be much easier with Text Miner because it can distinguish when terms are being used as prepositions/articles/conjunctions etc. rather than being purely string based. I am sure your entire analysis would benefit from other features of Text Miner as well.

On the Base SAS side, there are many string functions. One thought is to write out every space-delimited term for each document as its own observation and then use PROC FREQ to count them. There are some examples of writing out individual terms using the SCAN function on this web page:
http://support.sas.com/documentation/cdl/en/lrdict/61724/HTML/default/a000214639.htm
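
A minimal sketch of that approach, under some assumptions: the data set names (`comments`, `words`, `word_counts`), the sample rows, the delimiter list, and the short stop-word list are all made up for illustration and would need to be adapted to the real survey data.

```sas
/* Hypothetical sample comments; replace with your survey table. */
data comments;
   infile datalines truncover;
   input comment $200.;
   id = _n_;
   datalines;
The service was great and the staff was friendly
Great price, but the delivery was slow
Slow delivery and the price was too high
;
run;

/* Write one observation per word. */
data words (keep=id word);
   set comments;
   length word $40;
   /* Treat blanks and common punctuation as delimiters. */
   do i = 1 to countw(comment, ' .,;:!?()"');
      word = lowcase(scan(comment, i, ' .,;:!?()"'));
      /* Skip a small, illustrative stop-word list; extend as needed. */
      if word not in ('a','an','the','and','but','or','of','to','in','on','was','too')
         then output;
   end;
run;

/* Count each word, most frequent first, and save the counts. */
proc freq data=words order=freq;
   tables word / nocum out=word_counts;
run;
```

Because this is purely string based, a word such as "like" is counted the same way whether it is used as a verb or a preposition.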

SAS also has a relatively new hash object that allows you to accumulate counts inside the data step if you would like to avoid the proc freq call.
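
A minimal sketch of the hash object variant, assuming the same kind of hypothetical `comments` table (the data set names, variable names, and sample rows are made up for illustration). The hash object is keyed on the word and keeps a running count, which is written out once the last comment has been read.

```sas
/* Hypothetical sample comments; replace with your survey table. */
data comments;
   infile datalines truncover;
   input comment $200.;
   datalines;
The service was great and the staff was friendly
Great price, but the delivery was slow
Slow delivery and the price was too high
;
run;

data _null_;
   length word $40 count 8;
   if _n_ = 1 then do;
      /* Create the hash once; it persists across DATA step iterations. */
      declare hash h(ordered: 'a');
      h.defineKey('word');
      h.defineData('word', 'count');
      h.defineDone();
   end;
   set comments end=last;
   do i = 1 to countw(comment, ' .,;:!?()"');
      word = lowcase(scan(comment, i, ' .,;:!?()"'));
      /* FIND returns 0 and loads the stored count when the word exists. */
      if h.find() = 0 then count = count + 1;
      else count = 1;
      h.replace();
   end;
   /* After the last comment, write the accumulated counts to a data set. */
   if last then h.output(dataset: 'word_counts');
run;

/* Optional: list the most frequent words first. */
proc sort data=word_counts;
   by descending count;
run;
```

Stop words could be filtered here in the same way as in any other DATA step, for example with an IF condition before the FIND call.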

Russ
