fr4k
Calcite | Level 5

Hi, I'm using Enterprise Miner to classify a fairly large number of articles by their keywords.

My goal is to cluster them based on the similarity of their keywords.

In the editor of the Text Filter node I merged some keywords that have almost the same meaning (e.g. I merged text_analysis and texture_analysis) and saved the changes.

I would expect SAS to treat the two words as one from that point on, but when I run the Text Cluster node, both +text_analysis and texture_analysis appear in the cluster's descriptive terms.

How can I exclude texture_analysis from the list, given that it is already contained in +text_analysis?

Thank you in advance

1 REPLY 1
RussAlbright
SAS Employee

Only kept terms are chosen as descriptive terms, so it is more likely that the mapping isn't happening the way you would like. Try setting the synonyms in the Text Parsing node and rerunning. Then double-check in the filter viewer that the terms are mapped as expected, and rerun the clustering after that.

If part-of-speech tagging is on, an alternative explanation is that "texture_analysis" is being tagged in several ways, and the version you see in the descriptive terms is one that wasn't mapped to "text_analysis". The descriptive terms entry in the data set does not show the part-of-speech tag, so the same term (without the tag) can show up more than once in the descriptive terms report because it occurred multiple times with different part-of-speech tags.
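To make the part-of-speech explanation concrete, here is a small Python sketch. It is only a toy model, not how Text Miner is implemented: the function `descriptive_labels`, the role tags "Noun" and "Prop", and the rule that a "+" prefix marks a parent term are all assumptions made for illustration.

```python
def descriptive_labels(kept_terms, synonyms):
    """Return display labels for kept terms, dropping the part-of-speech tag.

    kept_terms: list of (term, role) pairs (hypothetical structure).
    synonyms:   dict mapping a (term, role) pair to its parent term.
    """
    parents = set(synonyms.values())
    labels = []
    for text, role in kept_terms:
        if (text, role) in synonyms:
            # This exact (term, role) pair is a child: it is reported via its parent.
            continue
        # Assumed convention: "+" marks a term that has children mapped to it.
        label = "+" + text if text in parents else text
        if label not in labels:
            labels.append(label)
    return labels


# Illustrative data: texture_analysis was tagged with two different roles.
kept_terms = [
    ("text_analysis", "Noun"),
    ("texture_analysis", "Noun"),
    ("texture_analysis", "Prop"),
]

# Only the Noun variant was mapped, so the Prop variant stays a term of its own.
partial = {("texture_analysis", "Noun"): "text_analysis"}
print(descriptive_labels(kept_terms, partial))
# ['+text_analysis', 'texture_analysis']

# Mapping every tagged variant removes the leftover entry.
full = {**partial, ("texture_analysis", "Prop"): "text_analysis"}
print(descriptive_labels(kept_terms, full))
# ['+text_analysis']
```

In this toy model the fix is to map each tagged variant of the term to the parent, which is why it is worth checking every occurrence of the term in the filter viewer rather than only the first one.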

Russ
