mganesh10
Fluorite | Level 6

Hi,

 

I have a SAS Text Topic (TT) node with a list of user-defined topics, and the node's result set contains the documents matching those topics. After looking through the data, I found that I need a condition using the negation (~) operator to get more accurate results.

For example, I have a topic called TOPIC_SODA, and the term that identifies it is SODA. The result set contains documents with the term SODA, but some of those documents also contain the term SCOTCH, and I don't want those documents to appear under TOPIC_SODA. So I tried to use the condition SODA & ~SCOTCH as the term for the topic. Unfortunately, I was not able to use this type of condition in the TT user topic definition.

Is there a node I can use to feed a condition-based user-defined topic to the Text Topic node? I need to build my custom topic from the condition above and use it in the Text Topic node to filter the documents. I've attached screenshots showing my result set and the failed approach.


[Attached screenshots: 1.PNG, Failed Topic.PNG, SCOTCH and SODA output.PNG]
4 REPLIES
rayIII
SAS Employee

Since you have already created your topics, you could create one more called TOPIC_SCOTCH. Then connect a SAS Code node to your TT node and use the transformation language to create a new topic.

 

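data &em_export_train;
	set &em_import_data;
	/* Flag documents that fall under the soda topic but not the scotch topic. */
	if textTopic_1 and not textTopic_2 then Soda = 1;
	else Soda = 0; /* explicit 0 so the flag is never left missing */
run;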

Now run the node and you will be able to filter your data by Soda.

 

Here the code assumes that TOPIC_SODA is the label for textTopic_1 and TOPIC_SCOTCH is the label for textTopic_2. To see the actual correspondence between labels and variable names, select the TT node, choose Exported Data, then Properties. Click on the Variables tab and select Label. This will show you the names of the topic variables that you should use in your DATA step code.
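
The same label-to-variable lookup can probably also be done in code. The following is only a sketch: it assumes it runs inside a SAS Code node connected after the TT node (so that &em_import_data is defined) and that the topic variable names start with "TextTopic", as in the DATA step above. The data set name work.tt_vars is made up for illustration.

```sas
/* Write the variable names and labels of the imported data to a data set. */
proc contents data=&em_import_data
              out=work.tt_vars(keep=name label) noprint;
run;

/* Show only the topic variables and their labels.                      */
/* Assumption: topic variable names begin with "TextTopic".             */
proc print data=work.tt_vars noobs;
    where upcase(name) like 'TEXTTOPIC%';
run;
```

The printed list should show which textTopic_n variable carries the TOPIC_SODA label and which carries TOPIC_SCOTCH, so the condition can be written against the right names.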

 

This tip may provide some perspective. (Regular expressions are another possibility.)

 

Hope this helps.


Ray

 

 

mganesh10
Fluorite | Level 6

Okay. In this case, all the documents that have the keyword scotch will be removed. I need to filter out only those documents in which both scotch and soda appear together.
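
To illustrate how the condition from the DATA step earlier in the thread behaves, here is a small standalone sketch. The four documents and their topic flags are invented, and work.demo and doc_id are hypothetical names; textTopic_1 stands for soda and textTopic_2 for scotch, as assumed in that reply.

```sas
/* Hypothetical four-document example; all values are invented. */
data work.demo;
    input doc_id textTopic_1 textTopic_2;
    /* Same condition as in the thread, with an explicit 0 for the other cases. */
    if textTopic_1 and not textTopic_2 then Soda = 1;
    else Soda = 0;
    datalines;
1 1 0
2 1 1
3 0 1
4 0 0
;
run;

/* All four rows are still present; only the Soda flag differs. */
proc print data=work.demo noobs;
run;
```

Under these assumptions, only document 1 (soda without scotch) gets Soda = 1. Document 2 (soda and scotch together) gets Soda = 0, and document 3 (scotch only) stays in the data with its textTopic_2 flag unchanged, because the DATA step adds a variable and does not delete any rows.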

mganesh10
Fluorite | Level 6

Also, I am not quite clear about how the SAS Code node and the TT node are connected. If the SAS Code node comes after the TT node, how can I see the documents falling under each topic, the way I can in the TT node? Should I connect one more TT node after the SAS Code node to view the documents related to each topic, or what should I do in this case?

rayIII
SAS Employee (accepted solution)

Once you have the SAS Code node running successfully, examine the data exported from that node. It will have the document-by-topic matrix along with any new topics you created in the SAS Code node.

 

You can hook up additional nodes to the SAS Code node to plot the topics, save the exported data, build predictive models, and so on. But no more TT nodes should be needed.
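
One possible way to look at the documents under the new topic without another TT node is to add a few lines after the DATA step in the same SAS Code node. This is a sketch only: it assumes the flag variable is called Soda, as in the earlier reply, and that &em_export_train is the data set that step wrote.

```sas
/* Count how many documents carry each value of the new flag. */
proc freq data=&em_export_train;
    tables Soda / missing;
run;

/* Print the first 20 documents flagged as Soda. */
proc print data=&em_export_train(obs=20);
    where Soda = 1;
run;
```

The limit of 20 rows is arbitrary and only keeps the printed output short; the output should appear in the results of the SAS Code node.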

 

Hope this helps

 

Ray
