Different results when going through Getting Started with SAS Text Miner 4.2 documentation

New Contributor
Posts: 2

Hello,

I have been trying to follow the example, starting at the section on using the Text Parsing node, and I am already stuck.

http://support.sas.com/documentation/cdl/en/tmgs/63281/HTML/default/viewer.htm#n0jrlfs7yzlrs4n1rpuwv...

The documentation shows the output below as what you should get from the Text Parsing node with the settings it suggests, but I only get that output when I apply those same settings to the Text Miner node.

tm1.jpg

Here is what I get from the Text Parsing node. I cannot figure out how to get rid of the punctuation ( . , / ( ) ; etc.). I have tried adding the punctuation to a stop list, choosing to ignore it, and so on.

tm2.jpg

Any ideas?

SAS Employee
Posts: 2

Re: Different results when going through Getting Started with SAS Text Miner 4.2 documentation

Posted in reply to TommyDrama

Tommy,

From what I can see, there appear to be two issues with your results. First, your Role column is blank, and second, you are not filtering out punctuation. Adjust the Text Parsing node properties Ignore Parts of Speech and Ignore Types of Attributes as indicated below.

An empty Role column indicates to me that you have not properly set the Ignore Parts of Speech property. To do so, click the ellipsis button next to the Ignore Parts of Speech property to open the Ignore Parts of Speech window. In that window, press Ctrl+A to select everything, then hold the Control key and click Noun to deselect that option. Noun should be the only item left unselected. Click OK.

Punctuation in your results indicates that the Ignore Types of Attributes property is not set correctly. By default, the Text Parsing node filters Numbers and Punctuation out of the results. To ensure that the Ignore Types of Attributes property is set correctly, click the ellipsis button next to the property. In the Ignore Types of Attributes window, select Punct (and any other attributes you wish to filter) from the list of options.

Additionally, no Stop List should be necessary for this example.
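
For anyone who wants to reason about what these two properties are meant to do together, here is a rough sketch in Python. It is not SAS code and not how Text Miner is implemented; the `Term` class, the `filter_terms` function, and the role and attribute labels other than Noun and Punct are assumptions made up for illustration. A term is dropped if its role is in the ignored parts of speech or its attribute is in the ignored attribute types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A parsed term (hypothetical structure, not the SAS output table)."""
    text: str
    role: str       # part of speech, e.g. "Noun"; empty when none is assigned
    attribute: str  # term type, e.g. "Alpha", "Punct", "Num" (assumed labels)


def filter_terms(terms, ignore_roles, ignore_attributes):
    """Keep only terms whose role and attribute are both not ignored."""
    return [
        t for t in terms
        if t.role not in ignore_roles and t.attribute not in ignore_attributes
    ]


terms = [
    Term("software", "Noun", "Alpha"),
    Term("run", "Verb", "Alpha"),
    Term(",", "", "Punct"),
    Term("42", "", "Num"),
    Term("(", "", "Punct"),
]

# Every part of speech except Noun is ignored (assumed label set).
ignore_roles = {"Verb", "Adj", "Adv", "Prep", "Det"}

# Both filters applied: only the noun survives.
kept = filter_terms(terms, ignore_roles, {"Punct", "Num"})

# Attribute filter left empty: punctuation and numbers slip through.
leaky = filter_terms(terms, ignore_roles, set())
```

In this sketch, `leaky` still contains the punctuation terms, which resembles the symptom in the second screenshot; `kept` contains only the noun.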

New Contributor
Posts: 2

Re: Different results when going through Getting Started with SAS Text Miner 4.2 documentation

Thanks Gerakios.

Unfortunately, the problem is not that easy. I already had Ignore Types of Attributes and Ignore Parts of Speech set as you suggested above. We applied the E25003 hot fix for SAS Text Miner, but it did not resolve the problem. I have been working with SAS Technical Support, and they have handed the problem to their developer. I was just curious to see whether anyone else had seen this problem.
