Text mining and content categorization

Subject Importance for Filtering/Topic/Cluster

Senior User xav

Hello everyone,


I am fairly new to text mining in SAS and have been reading through most of the SAS tutorials. Is there a way to tell the miner which terms are the most important, whether in the filter, topic, or cluster step? I know topics can be user-defined, and I have tried that (not extensively). I also know that I can choose different term weights (entropy, inverse document frequency, default), but when I ran my data through with each setting, I seemed to get the same results.


To put it into perspective: say I am mining the program requirements for all the universities in Canada. English will probably come out as the most important term, because every program requires English as a subject. However, since every high school student takes English, it is less important than something like Data Management. So I want to train SAS to read the documents and know that Data Management is more important than English, preferably without having to go into each mining session and tell SAS that manually.
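
To make the idea concrete outside of SAS, here is a minimal sketch in Python of the kind of weighting I have in mind. It uses a plain textbook inverse document frequency, log(N / df), which is not necessarily the formula SAS applies. The function name `idf_weights` and the sample program requirements are made up for illustration.

```python
import math
from collections import Counter


def idf_weights(documents):
    """Return log(N / df) for every term.

    N is the number of documents and df is the number of documents
    that contain the term at least once.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # set() so a term counts once per document, however often it repeats
        doc_freq.update(set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical program requirements, one list of terms per program
programs = [
    ["english", "calculus", "data management"],
    ["english", "biology"],
    ["english", "chemistry", "calculus"],
    ["english", "data management"],
]

weights = idf_weights(programs)

# Print terms from most to least distinctive
for term, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.3f}")
```

With this made-up data, English appears in every program and gets a weight of 0, while Data Management appears in only half of them and gets a higher weight. That is the ordering I would like SAS to arrive at on its own.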


I don't know if I made sense, but any help would be greatly appreciated.


Thank you.
