
The New Unmatched Documents Tab in SAS Visual Text Analytics

Started ‎07-15-2025 by
Modified ‎07-15-2025 by

The purpose of this post is to shed light on the new Unmatched Documents tab, which appears when working with certain SAS Visual Text Analytics nodes in Model Studio. This post assumes readers are already familiar with Visual Text Analytics (VTA) running in Model Studio.

 

First, I must admit I’m using the term “new” a bit loosely. The Unmatched Documents tab has been available since the 2024.11 stable release of SAS Viya, which dates back to November 2024. Given the rate at which technology is advancing these days, some may consider this an “old” addition! I think, however, that for most VTA users (me included) it is a feature they have not seen or used before, so please do not take offense if I continue to refer to it as “new” throughout this post.

I also stated above that the tab is available for “certain” nodes when using VTA in Model Studio. Specifically, the Unmatched Documents tab is available in the Interactive Window for the Categories node, the Concepts node, the Text Parsing node, and the Topics node. I will not include screen captures or examples for every one of these nodes, because that would be redundant. Once you get a feel for what information the tab provides and how it helps the analyst, I’m pretty sure you’ll get the point of it. If you use VTA regularly in Model Studio, you are familiar with the Matched Documents tab available in the Interactive Window for these nodes. The Unmatched Documents tab is like the Matched Documents tab, but it shows the documents that do not match the selected category, concept, term, or topic.

 

I’ll illustrate the new tab by working through an example, starting in the Concepts node. The example uses data based on feedback from patients taking medication for depression or anxiety. The data have been fully anonymized, and even the drug names are artificial. The goal is to extract drug dosages from the patient feedback data.

 

After running the Concepts node for a VTA project, the node can be opened in the Interactive Window. For this example, custom LITI code using the REGEX rule was written to extract drug dosages in milligrams (mg). Below is the Interactive Window for the Concepts node with the rules for the custom concept shown.

 

01_JT_concepts_node_open_dosage-1536x696.png
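For readers who cannot make out the rule in the screenshot, here is a rough Python analogue of what a dosage-matching regular expression might look like. This is only a sketch: the pattern, the name `DOSAGE_RE`, and the sample comments are my own assumptions and are not the LITI REGEX rule used in the project. The assumed pattern accepts a number that starts its own whitespace-delimited token, followed by "mg".

```python
import re

# Hypothetical pattern (an assumption, not the LITI rule from the post):
# a whole or decimal number that starts a whitespace-delimited token,
# an optional space, then "mg".
DOSAGE_RE = re.compile(r"(?<!\S)\d+(?:\.\d+)?\s?mg\b", re.IGNORECASE)

def find_dosages(text):
    """Return every dosage-like string found in one document."""
    return DOSAGE_RE.findall(text)

# Made-up comments, not taken from the patient feedback data.
print(find_dosages("Started on 150mg and moved up to 225 mg after a month."))
# ['150mg', '225 mg']
print(find_dosages("No side effects so far."))
# []
```

A LITI REGEX rule is written in its own syntax inside the concept definition, so treat the above purely as an illustration of the matching idea.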

 


 

 

In the usual Documents portion of the window, the new Unmatched tab is immediately visible next to the Matched tab.

 

02_JT_unmatched-tab.png

 

In the Documents window, the All tab indicates that the data are made up of 1,414 documents. Each document is a written comment from a patient taking depression or anxiety medication. Below is a screenshot showing typical comments.

 

03_JT_Concepts_node_ALL_dosage-1024x307.png

 

 

Clicking the Matched tab shows that 300 of the 1,414 documents contain a dosage amount. The two documents below show matches of 225mg and 150mg, respectively.

 

04_JT_concepts_node_matched_dosage-1024x306.png

 

 

Clicking the Unmatched tab shows that 1,114 documents (1,414 total documents – 300 matched documents) did not contain matches for a dosage.

 

05_JT_concepts_node_unmatched_dosage-1024x310.png
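The relationship between the three tabs is a simple partition: every document is either matched or unmatched, so the two counts always add up to the total (here, 300 + 1,114 = 1,414). A minimal Python sketch of that idea follows. The pattern, the function name `split_by_match`, and the sample comments are hypothetical and stand in for the custom concept and the real data.

```python
import re

# Hypothetical stand-in for the custom concept (not the LITI rule from the post).
DOSAGE_RE = re.compile(r"(?<!\S)\d+(?:\.\d+)?\s?mg\b", re.IGNORECASE)

def split_by_match(documents):
    """Partition documents into (matched, unmatched) lists."""
    matched, unmatched = [], []
    for doc in documents:
        if DOSAGE_RE.search(doc):
            matched.append(doc)
        else:
            unmatched.append(doc)
    return matched, unmatched

# Made-up comments for illustration only.
docs = [
    "Taking 225mg daily and sleeping better.",
    "My doctor lowered me to 150mg last week.",
    "Felt dizzy for the first few days.",
    "No complaints so far.",
    "It took a month before I noticed any change.",
]
matched, unmatched = split_by_match(docs)

# Each document lands in exactly one list, so the counts sum to the total.
print(len(docs), len(matched), len(unmatched))
# 5 2 3
```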

 

 

One reason an analyst might review documents without matches is to look for documents where the concept being extracted was missed. A concept may be missed because the custom LITI code was not written accurately, so perusing unmatched documents can help debug or improve the LITI code. (In the current example, the analyst may consider using the search functionality to look for unmatched documents that contain “mg”.) Here’s an example of what I mean. Scrolling down the Unmatched list shows the following document:

 

06_JT_concepts_node_unmatched_missed_mg_dosage.png

 

 

We see that the document contains a dosage, but the dosage appears to be a range (2-5mg) rather than an individual amount. The analyst could then decide if they want to rewrite the LITI code to account for ranges of dosage amounts or perhaps ignore this if only individual dosages are to be extracted.
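To make the debugging loop concrete, here is a small Python sketch of the same idea: search the unmatched documents for "mg" to surface possible misses, then try a pattern that also accepts ranges. Both patterns (`SINGLE_RE`, `RANGE_RE`) and the sample comments are assumptions for illustration. I do not know exactly why the actual LITI rule skipped the range, so the single-amount pattern below simply assumes the number must stand at the start of its own whitespace-delimited token.

```python
import re

# Hypothetical single-amount pattern: a number at the start of a
# whitespace-delimited token, followed by "mg". An assumption, not the LITI rule.
SINGLE_RE = re.compile(r"(?<!\S)\d+(?:\.\d+)?\s?mg\b", re.IGNORECASE)

# Hypothetical extension that also accepts a range such as "2-5mg".
RANGE_RE = re.compile(
    r"(?<!\S)\d+(?:\.\d+)?(?:\s?-\s?\d+(?:\.\d+)?)?\s?mg\b", re.IGNORECASE
)

# Made-up comments for illustration only.
docs = [
    "Taking 225mg daily and sleeping better.",
    "Was told to take 2-5mg as needed.",
    "Felt dizzy for the first few days.",
]

unmatched = [d for d in docs if not SINGLE_RE.search(d)]

# A keyword search within the unmatched documents surfaces possible misses.
suspects = [d for d in unmatched if "mg" in d.lower()]
print(suspects)
# ['Was told to take 2-5mg as needed.']

# The range-aware pattern picks up the dosage the first pattern skipped.
print([RANGE_RE.search(d).group(0) for d in suspects])
# ['2-5mg']
```

Whether to adopt something like the range-aware version depends on the analysis goal. If only individual dosages are wanted, the narrower rule is the right one and the unmatched document can be ignored.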

 

Let’s take a look at the Unmatched tab for another node. Below, the Text Parsing node has been run and we are investigating the term “depression” in the same data described above. The Unmatched tab in the Documents window has been selected.

 

07_JT_text_parsing_unmatched_depression-1024x467.png

 

 

We see that of the 1,414 total documents, 492 contain the term “depression” and 922 do not. Since the Unmatched tab has been selected, the documents that do not contain the term are shown. The analyst might want to investigate the unmatched documents for common misspellings of the term of interest, or simply to get a feel for documents lacking the selected term.
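As an illustration of the misspelling check, the sketch below scans unmatched documents for tokens that are close to the term of interest, using `difflib.get_close_matches` from the Python standard library. The sample comments, the simple tokenizer, and the 0.85 similarity cutoff are my own assumptions. The Text Parsing node does its own term processing, so exact matching on tokens here is only a rough stand-in.

```python
import re
from difflib import get_close_matches

TERM = "depression"

# Made-up comments for illustration only.
docs = [
    "My depression has eased since I started.",
    "The depresion came back after I stopped.",
    "Anxiety is much better now.",
]

def tokens(text):
    """Return the lowercase alphabetic tokens in a document."""
    return re.findall(r"[a-z]+", text.lower())

# Documents whose tokens do not include the exact term.
unmatched = [d for d in docs if TERM not in tokens(d)]

# Collect near-misses of the term among the unmatched documents' tokens.
near_misses = set()
for doc in unmatched:
    near_misses.update(get_close_matches(TERM, tokens(doc), n=3, cutoff=0.85))

print(sorted(near_misses))
# ['depresion']
```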

 

As stated earlier, the Unmatched tab is also available in the Interactive Window for the Topics and Categories nodes.

 

I hope this explanation of the “new” (or “old”?) Unmatched Documents tab in the Interactive Window for VTA has been helpful. I’d love to hear from you if you take advantage of this feature, so please leave me a comment if you use it in an analysis. Examples of how analysts use it would help me share practical uses with customers who may take my course in the future!

 

For more on:

 

Training in Text Analytics

 

SAS Visual Text Analytics

 

VTA Pipeline Overview

 

The Text Window Feature for Term Maps

 

 

Find more articles from SAS Global Enablement and Learning here.
