TommyDrama
Calcite | Level 5

Hello,

I have been trying to follow the example, starting with the Text Parsing node, and I am already stuck.

http://support.sas.com/documentation/cdl/en/tmgs/63281/HTML/default/viewer.htm#n0jrlfs7yzlrs4n1rpuwv...

The example shows that you should get the results below from the Text Parsing node when you apply the settings it suggests, but I only get those results when I apply the same settings to the Text Miner node.

tm1.jpg

Here is what I get from the Text Parsing node. I cannot figure out how to get rid of the punctuation ( . , / ( ) ; etc.). I have tried adding it to a stop list, choosing to ignore it, and so on.

tm2.jpg

Any ideas?

gerakios
SAS Employee

Tommy,

From what I can see, there appear to be two issues with your results. First, your Role column is blank, and second, you are not filtering out punctuation. Adjust the Text Parsing node properties Ignore Parts of Speech and Ignore Types of Attributes as indicated below.

An empty Role column indicates to me that you have not properly set the Ignore Parts of Speech property. To do so, click the ellipsis button next to the Ignore Parts of Speech property to open the Ignore Parts of Speech window. In that window, press Ctrl+A to select everything, then hold the Control key and click Noun to deselect it. Noun should be the only part of speech left unselected. Click OK.

Punctuation in your results indicates that the Ignore Types of Attributes property is not set correctly. By default, the Text Parsing node filters Numbers and Punctuation out of the results. To ensure that the Ignore Types of Attributes property is set correctly, click the ellipsis button next to it. In the Ignore Types of Attributes window, select Punct (and any other attributes you wish to filter) from the list of options.

Additionally, no Stop List should be necessary for this example.
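
For intuition only, the combined effect of the two properties can be sketched outside SAS as a simple filter over parsed terms. This is a minimal Python sketch, not SAS Text Miner code and not how the node is implemented: the `Term` class, the `filter_terms` function, the sample terms, and the role and attribute labels other than Noun and Punct are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One parsed term. All field values here are illustrative assumptions."""
    text: str
    role: str       # part of speech, e.g. "Noun"; empty if none was assigned
    attribute: str  # attribute type, e.g. "Alpha", "Num", "Punct"


def filter_terms(terms, ignore_roles, ignore_attributes):
    """Keep only terms whose role and attribute are both not ignored."""
    return [
        t for t in terms
        if t.role not in ignore_roles and t.attribute not in ignore_attributes
    ]


# Hypothetical parse output; not taken from the example data set.
terms = [
    Term("report", "Noun", "Alpha"),
    Term(",", "", "Punct"),
    Term("file", "Verb", "Alpha"),
    Term("2009", "Num", "Num"),
    Term("engine", "Noun", "Alpha"),
]

# "Select everything, then deselect Noun": ignore every role except Noun.
all_roles = {"Noun", "Verb", "Adj", "Adv", "Prep", "Num"}  # assumed label set
ignore_roles = all_roles - {"Noun"}

# Ignore punctuation and numbers by attribute type.
ignore_attributes = {"Punct", "Num"}

kept = filter_terms(terms, ignore_roles, ignore_attributes)
print([t.text for t in kept])
```

In this sketch the comma has no role, so only the attribute filter removes it. That is why both properties need to be set: the part-of-speech filter alone would not drop punctuation.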

TommyDrama
Calcite | Level 5

Thanks Gerakios.

I wish the problem were that easy. I already had the Ignore Types of Attributes and Ignore Parts of Speech properties set as you suggested above. We applied the E25003 hot fix for SAS Text Miner, but it did not fix the problem. I have been working with SAS Technical Support, and they have handed the problem to their developer. I was just curious to see if anyone else had seen this problem.
