04-01-2017 04:43 AM
I have a question about TEXTFILTER_SNIPPET in the Text Filter node (Text Miner) in SAS Enterprise Miner 14.1.
In Text Filter node --> Interactive Filter Viewer, the textfilter_snippet shows a few words from the original text/sentence/document along with the searched term. It usually shows 4 to 5 words after the searched term, followed by an ellipsis. How can I extend the snippet so that it shows the full sentence from the original document, or at least more words (10 instead of 4 to 5) along with the searched term?
Can this be changed?
I hope my question is clear.
04-06-2017 07:16 PM
Thanks for answering!
By the way, is there any way to export the search results from the Text Filter node (Interactive Filter Viewer)?
I used a SAS Code node to export them, but the exported data has fewer rows than when I copy and paste the search results into Excel. Browsing the textfilter_train contents also shows fewer rows (about 1800) than copying and pasting the results into Excel (more than 2000).
Also, when I search for a large number of terms at once, some of the results come back with an empty textfilter_snippet. I can see that the search pulls out the documents that contain the terms I want, but some of their textfilter_snippet values remain blank.
Is there any way I can fix these two problems, or is something wrong with my input?
04-09-2017 10:56 PM
When you save the changes from your search in the Interactive Filter Viewer window, the collection is subsetted in the exported train data set of the Text Filter node (see the "Exported Data" property on the Text Filter node). By the way, you can also enter the search in the search property on the Text Filter node to begin with and skip the Interactive Filter Viewer step, if you like.
I am guessing that the difference in the search results between the Text Filter node and Excel is that the Text Filter node does a whole-word search. In that node's search, a query term will not match just because it is a substring of a longer term. I am guessing the Excel search is character based rather than term based and will match on those substrings. Beyond that explanation, you may want to pick out a particular instance in a particular document; drilling down into that one case may turn up an explanation.
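To make that guess concrete, here is a small sketch in Python (not SAS) of how a character-based search and a whole-word search can return different row counts for the same query. It is only an illustration of the general idea: the function names and sample documents are made up, and it is not how the Text Filter node or Excel is actually implemented.

```python
import re

def substring_match(query, text):
    """Character-based match: true if the query appears anywhere in the text."""
    return query.lower() in text.lower()

def whole_word_match(query, text):
    """Term-based match: true only if the query appears as a complete word."""
    pattern = r"\b" + re.escape(query) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Hypothetical documents, invented for this illustration only.
docs = [
    "The cat sat on the mat.",
    "Please check the product catalog.",
    "Concatenate the two tables.",
]

substring_hits = [d for d in docs if substring_match("cat", d)]
whole_word_hits = [d for d in docs if whole_word_match("cat", d)]

print(len(substring_hits), "substring matches")
print(len(whole_word_hits), "whole-word matches")
```

If the extra rows you see in Excel all contain the query only as part of a longer word, that would be consistent with this explanation.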
I am not sure about the missing snippet contents. If you can isolate a case, it might be informative. If you definitely see an issue there, please contact tech support so we can get it resolved if necessary. Thanks