SAS Code to write conditions for user defined topics

Occasional Contributor
Posts: 8
SAS Code to write conditions for user defined topics

Hi,

 

I have a SAS Text Topic (TT) node with a list of user-defined topics, and its result set contains the documents matching those topics. After glancing at the data, I found that I need a condition that uses the negation (~) operator to get more accurate results. For example, I have a topic called TOPIC_SODA, and the term that identifies it is SODA. The result set contains documents with the term SODA, but some of those documents also contain the term SCOTCH, and I don't want those documents to appear under TOPIC_SODA. So I decided to use the condition SODA & ~SCOTCH in the term classification. Unfortunately, I was not able to use this type of condition in the TT user topic declaration. Is there a node I can use to feed a condition-based user-defined topic to the Text Topic node? I need to build my custom topic from the condition above and use it in the Text Topic node to filter the documents. I've attached screenshots showing my result set and the failed approach.


[Attachments: 1.PNG, Failed Topic.PNG, SCOTCH and SODA output.PNG]

All Replies
SAS Employee
Posts: 106

Re: SAS Code to write conditions for user defined topics

Posted in reply to mganesh10

Since you have already created your topics, you could create one more called TOPIC_SCOTCH. Then connect a SAS Code node to your TT node and use the transformation language to create a new topic.

 

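data &em_export_train;
	set &em_import_data;
	/* Flag documents assigned to the soda topic but not the scotch topic. */
	/* The expression yields 1 or 0, so Soda is never left missing. */
	Soda = (textTopic_1 and not textTopic_2);
run;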

Now run the node and you will be able to filter your data by Soda.

 

Here the code assumes that TOPIC_SODA is the label for textTopic_1 and TOPIC_SCOTCH is the label for textTopic_2. To see the actual correspondence between labels and variable names, select the TT node, choose Exported Data, then Properties. Click the Variables tab and select Label. This shows the names of the topic variables that you should use in your DATA step code.
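
The same label lookup could also be done in code. This is only a minimal sketch, assuming the topic variables share the TextTopic_ prefix in your exported data (check your own variable names); it lists each topic variable with its label from inside a SAS Code node attached to the TT node:

```sas
/* List names and labels of the topic variables imported from the TT node. */
/* Assumption: topic variables start with TextTopic_ in your data.         */
proc contents data=&em_import_data(keep=TextTopic_:) varnum;
run;
```

The Label column of the output should show TOPIC_SODA and TOPIC_SCOTCH next to the variable names to use in the condition.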

 

This tip may provide some perspective. (Regular expressions are another possibility.)
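
If you go the regular expression route, a rough sketch could look like the following. It assumes the raw document text is in a character variable called text, which is a hypothetical name; substitute the text variable from your own data. It matches the words directly instead of relying on the topic flags:

```sas
data &em_export_train;
	set &em_import_data;
	/* 'text' is a hypothetical variable name for the document text. */
	/* PRXMATCH returns the match position, or 0 when there is none. */
	has_soda   = (prxmatch('/\bsoda\b/i', text) > 0);
	has_scotch = (prxmatch('/\bscotch\b/i', text) > 0);
	/* 1 only when soda is mentioned and scotch is not. */
	Soda = (has_soda and not has_scotch);
run;
```

Note that this looks at literal words only, so it will not pick up synonyms or stemmed forms the way the TT node's term handling might.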

 

Hope this helps.


Ray

 

 

Occasional Contributor
Posts: 8

Re: SAS Code to write conditions for user defined topics

Okay. In this case, all the documents that have the keyword scotch will be removed. I need to filter out only those documents in which scotch and soda appear together.

Occasional Contributor
Posts: 8

Re: SAS Code to write conditions for user defined topics

Also, I am not quite clear about how the SAS Code node and the TT node are connected. If the SAS Code node comes after the TT node, how can I see the documents falling under each topic, the way I can in the TT node? Should I connect one more TT node after the SAS Code node to view the documents for each topic, or should I do something else?

Solution
‎08-17-2016 09:33 PM
SAS Employee
Posts: 106

Re: SAS Code to write conditions for user defined topics

Posted in reply to mganesh10

Once you have the SAS Code node running successfully, examine the data exported from that node. It will have the document*topics matrix along with any new topics you created in the SAS Code node.

 

You can hook up additional nodes to the SAS Code node to plot the topics, save the exported data, build predictive models, and so on. But no more TT nodes should be needed.
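
For a quick look at which documents ended up under the new topic, something like this could go in a second SAS Code node connected after the first one. It is only a sketch and assumes the new flag is named Soda, as in the earlier code:

```sas
/* Count how many documents carry the new Soda flag. */
proc freq data=&em_import_data;
	tables Soda / missing;
run;

/* Print the first 20 documents flagged as Soda. */
proc print data=&em_import_data(obs=20);
	where Soda = 1;
run;
```

The PROC FREQ table gives the document count per flag value, and the PROC PRINT listing lets you read a sample of the flagged documents.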

 

Hope this helps

 

Ray

☑ This topic is solved.
