Peter_J
Fluorite | Level 6

just joined - have been using Text Miner only through the tutorials for a year or so in graduate classes - have the getting started and a help file - did not get to the following question:

have verbal answers to survey question already classified into several categories - used parsing, filtering, and decision tree to train a classification model - would like to use this model to classify similar unclassified data in a separate file - how do I do that using Text Miner?

seems like an obvious text mining procedure but could not find specific answer - am I describing it wrong?

using SAS OnDemand for Academics Enterprise Miner 14.4

CraigDeVault
SAS Employee

Peter,

Do you already have a target variable in your data set?  In order to run the Decision Tree, you need to have a target variable.

If you already have a target variable, then you can follow the steps on how to do this within the Getting Started with SAS Text Miner 13.2, found at the following URL:

   http://support.sas.com/documentation/cdl/en/tmgs/67510/PDF/default/tmgs.pdf

Go to Chapter 7 and you can see the steps to incorporate output from the text mining nodes into a modeling flow.

Peter_J
Fluorite | Level 6

Craig - thanks for the quick reply - yes, the training set has a categorical target variable - in the Getting Started with SAS Text Miner 13.2 tutorial that I have, I see nothing in Chapter 7 about how to apply the model trained on observations in one file to uncategorized observations in another file - nowhere do I see how the second file either gets read into the project and run as a test set, or how the model built on the training set gets read into another project with the uncategorized test set

everything in Chapter 7 is from the single VAERS file read at the beginning - the way I understand the chapter, the Decision Tree nodes are all applied to the test partition of the original data after the Text Topics and Rule Builder nodes did their work, not to a test set of observations in another file

what I want to do is to build the model on the first phase of a multiple-phase sampling study and have it applied to the subsequent samples, collected possibly some time after the original sample

is there a way to save the results of the Decision Tree node from the first sample so that if a new sample is read into the project the model can be applied to this second sample to categorize its responses?

or if the model can be stored somewhere and a new project is created for the second sample, can that model be retrieved from where it was stored and applied to the new sample?

after the results of applying the model to the second sample are checked and found acceptable, this might be a way to increase the training set on which an updated model is built for analyzing subsequent samples, in a sort of bootstrap manner

am I on a reasonable track here? - peter j
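
The workflow described above (train on one file, store the model, apply it to a later file, then fold checked predictions back into the training set) can be sketched outside Enterprise Miner. What follows is a minimal Python illustration, not Text Miner functionality. It swaps the decision tree for a small naive Bayes classifier built only from the standard library, and the sample responses, category names, file name, and helper functions (`tokenize`, `train`, `classify`) are all hypothetical.

```python
import math
import os
import pickle
import re
import tempfile
from collections import Counter


def tokenize(text):
    """Lowercase a response and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(responses, labels):
    """Fit a multinomial naive Bayes model on labeled responses."""
    word_counts = {}  # category -> Counter of word frequencies
    vocab = set()
    for text, label in zip(responses, labels):
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts,
            "doc_counts": Counter(labels),
            "vocab": vocab}


def classify(model, text):
    """Return the most probable category for one uncategorized response."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokenize(text):
            if token in model["vocab"]:  # skip words never seen in training
                # Laplace-smoothed log likelihood of the word in this category
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Phase 1: hypothetical responses that were already categorized by hand.
phase1_responses = [
    "the price is too high",
    "too expensive for me",
    "staff were friendly and helpful",
    "helpful service and friendly people",
]
phase1_labels = ["cost", "cost", "service", "service"]
model = train(phase1_responses, phase1_labels)

# Store the trained model so that a later session can retrieve it.
model_path = os.path.join(tempfile.mkdtemp(), "survey_model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Phase 2 (possibly much later): reload the model and score a new sample.
with open(model_path, "rb") as f:
    stored_model = pickle.load(f)
phase2_responses = ["far too expensive", "very friendly staff"]
predictions = [classify(stored_model, r) for r in phase2_responses]
print(list(zip(phase2_responses, predictions)))

# Once the predictions have been checked by hand, retrain on both samples.
updated_model = train(phase1_responses + phase2_responses,
                      phase1_labels + predictions)
```

Folding model-assigned labels back into the training set only helps if they are reviewed first, as the post suggests; unchecked predictions would feed the model's own mistakes back into the next round.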
