
Need some inspiration for better SAS Visual Text Analytics concept matches? Let’s talk.


The purpose of this post is to describe how to enhance concept rules so that they retrieve more focused information from a collection of documents.

 

We'll explore concept rules and ideas for refining them in the context of an analyst processing a large number of customer complaints at a large bank.

 

Better concept matches start with interactive exploration in SAS Visual Text Analytics.

 

For an introduction to the capabilities in SAS Visual Text Analytics, see my post.

 

Concepts are used to extract relevant information from documents. This analytical journey starts with exploratory analysis. When processing customer complaint documents, concepts help us: identify conditions where someone needs help, identify opportunities to improve customer service, discover social media comments to address before a complaint goes viral, and more.

 

First, some freebie tips.

 

01_saspchj1.png


 

Predefined concepts are identified automatically when you select the corresponding check box in the properties panel. They use natural language processing (NLP) techniques to identify concepts such as measures, money, organizations, people, places, and more. The nlpMeasure concept, for example, returns matches for terms such as 2 years, two days, 5 mg, $250., and 8 feet. The nlpMoney concept matches currency expressions such as $2, 2.50€, 40 yen, and 20 pounds.

 

You can combine predefined concepts with new custom concepts to easily identify documents that contain specific information. Keep these freebies in mind as you build your own concepts.

 

For example, the rule argument (SENT, "_c{cashed}", "nlpMoney") returns matches for any document that has the term “cashed” and any currency amount in the same sentence. SAS recognizes currency whether it is U.S. dollars, British pounds, and so on.
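To make the idea of a sentence-level match concrete, here is a minimal sketch in plain Python. It is not LITI and not how SAS evaluates the rule: the MONEY pattern is only a rough, assumed stand-in for the predefined nlpMoney concept, and the sentence splitter and the function names are hypothetical helpers made up for this illustration.

```python
import re

# Rough stand-in for nlpMoney (an assumption, far simpler than the real concept):
# a currency symbol followed by a number, or a number followed by a currency
# symbol or currency word.
MONEY = re.compile(
    r"[$£€]\s?\d[\d,]*(?:\.\d+)?"
    r"|\d[\d,]*(?:\.\d+)?\s?(?:€|(?:dollars?|pounds?|euros?|yen)\b)",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentences_with_term_and_money(text, term="cashed"):
    """Return the sentences that contain both the term and a money amount."""
    term_re = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [
        sentence
        for sentence in split_sentences(text)
        if term_re.search(sentence) and MONEY.search(sentence)
    ]


complaint = (
    "The teller cashed my check for $250 without asking for ID. "
    "I cashed another check last week. "
    "The fee was 20 pounds."
)
matches = sentences_with_term_and_money(complaint)
print(matches)
```

Only the first sentence qualifies, because it is the only one where “cashed” and a currency amount appear together. The second sentence has the term but no amount, and the third has an amount but no term.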

 

Your mission…

 

Let’s consider an example scenario.

 

You work for a bank, and one of your responsibilities is to evaluate 10,000 recent customer complaints that came in through various channels and are available as machine-readable text. Your mission is to investigate complaints related to the handling of cash and recommend actions your company might take to improve the customer experience.

 

What is the first thing you would try?

 

Exactly, Visual Text Analytics!

 

Open the documents in SAS Visual Text Analytics, run the text parsing node, and look at the kept terms and the list of documents. You might then search the documents for the word cash to see whether this gives you any ideas. Your search returns 700 documents, and you see phrases in the documents like:

 

Cash a check

Cash advance

Cash back

Cash price

Cash deposit / withdrawal

 

02_saspchj2.png

 

Remember these pointers to structure an effective search:

 

  • Placing the "+" in front of a word requires the word to be in a document.
  • A term without a preceding "+" is optional: documents that contain it can match, but the term does not have to be present.
  • A "-" in front of a word finds documents that do not contain that term.
  • Place a * at the beginning, middle, or end of a search term to get wildcard matches. For example, the query +*ion matches terms such as exception or action.
  • Quotation marks return the specific content in between such as “fraudulent check”.
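The pointers above can be sketched as a small matcher in plain Python. This is only an illustration of the semantics described in the list, not the search engine inside SAS Visual Text Analytics. The function names are hypothetical, and one behavior is an assumption: when a query has no "+" terms, the sketch requires at least one optional term to appear.

```python
import re
from fnmatch import fnmatchcase


def parse_query(query):
    """Split a query into (required, excluded, optional) lowercase patterns."""
    required, excluded, optional = [], [], []
    buckets = {"+": required, "-": excluded, "": optional}
    # Each item is an optional sign followed by a "quoted phrase" or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|(\S+))', query):
        buckets[sign].append((phrase or word).lower())
    return required, excluded, optional


def hits(pattern, tokens):
    """True if a word pattern (may contain *) or a phrase occurs in the tokens."""
    if " " in pattern:
        # Quoted phrase: the words must appear next to each other, in order.
        joined = " ".join(tokens)
        return f" {pattern} " in f" {joined} "
    return any(fnmatchcase(token, pattern) for token in tokens)


def document_matches(query, document):
    """Apply the +, -, * and quotation-mark pointers to a single document."""
    required, excluded, optional = parse_query(query)
    tokens = re.findall(r"[\w']+", document.lower())
    if any(hits(p, tokens) for p in excluded):
        return False
    if required:
        return all(hits(p, tokens) for p in required)
    # Assumption: with no "+" terms, at least one optional term must appear.
    return not optional or any(hits(p, tokens) for p in optional)


doc_check = "I tried to cash a check but the teller refused."
doc_deposit = "The cash deposit never showed up in my account."
doc_fraud = "Someone passed a fraudulent check on my account; no action was taken."

print(document_matches("+cash -deposit", doc_check))
print(document_matches("+cash -deposit", doc_deposit))
print(document_matches('"fraudulent check"', doc_fraud))
print(document_matches("+*ion", doc_fraud))
```

The first query keeps the check-cashing complaint and drops the deposit complaint, because "-deposit" excludes it. The last two queries both find the fraud complaint, one through the quoted phrase and one through the wildcard match on “action”.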

 

You might then shift your attention to the kept terms list at the top and search for cash again and notice the following.

  

03_saspchj3.png

 

In this display, cash is recognized as a noun, verb, proper noun, and in a noun group by the NLP process. You decide to look at the verb cash in the term map next.

 

04_saspchj4.png

 

This map shows the terms related to cash based on information gain. One path through the map suggests that some documents contain (cash, check, issue) and do not contain (offer or deposit), where the ~ sign indicates “not”. These relationships are new insights to consider as you construct custom concept rules.
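If information gain is new to you, the following plain-Python sketch shows the textbook definition: how much knowing whether a candidate term is present reduces the uncertainty about whether a target term such as cash is present. The post does not describe exactly how the term map computes its values, so treat this as an assumption-laden illustration. The function names and the tiny document collection are made up.

```python
from math import log2


def entropy(positives, total):
    """Binary entropy in bits for `positives` hits out of `total` documents."""
    if total == 0 or positives in (0, total):
        return 0.0
    p = positives / total
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(docs, target, candidate):
    """Entropy of `target` presence minus its entropy after splitting the
    documents by `candidate` presence. `docs` is a list of sets of terms."""
    n = len(docs)

    def count_target(group):
        return sum(target in doc for doc in group)

    with_candidate = [doc for doc in docs if candidate in doc]
    without_candidate = [doc for doc in docs if candidate not in doc]
    before = entropy(count_target(docs), n)
    after = sum(
        len(group) / n * entropy(count_target(group), len(group))
        for group in (with_candidate, without_candidate)
    )
    return before - after


# Made-up documents, each reduced to its set of kept terms.
docs = [
    {"cash", "check", "issue"},
    {"cash", "check"},
    {"cash", "check", "issue"},
    {"deposit", "offer"},
    {"deposit", "account"},
    {"offer", "account"},
    {"cash", "deposit"},
    {"check", "account"},
]

# Rank candidate terms by how much they tell us about the presence of "cash".
candidates = ["check", "issue", "offer", "deposit"]
ranking = sorted(
    candidates,
    key=lambda term: information_gain(docs, "cash", term),
    reverse=True,
)
for term in ranking:
    print(term, round(information_gain(docs, "cash", term), 3))
```

A term scores high whether its presence or its absence goes along with cash, which is why a map built on information gain can contain both plain terms and "~" (not) terms.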

 

In the parsing node you can click the similarity scores icon, remove “cash” from the kept terms filter, and notice the additional new terms that appear. “Check fraud” and “forged check” didn’t initially come to mind when exploring cash, but you may want to expand your investigation for occurrences of fraud related to cash!

 

05_saspchj5.png

 

Concepts Node

 

Next on your journey, you add and run a concepts node to identify predefined concepts in the documents, and then create a custom concept that looks for morphological expansions of the term “cash” together with fraud and check. The initial LITI (language interpretation for textual information) CONCEPT and CLASSIFIER rules follow.

 

06_saspchj6.png

 

These results look promising, but a lot of documents matched. You wonder if there might be some additional ideas for rules. Click the Autogenerate concept rules icon in the concept that has just 3 simple rules.

 

07_saspchj7.png

 

The new rules are written to the sandbox. The generated rules need more explanation, especially if the LITI syntax is new to you. We plan to cover some of them in the next post, but in the meantime, here is a sample of the autogenerated rules.

 

After building concept rules, the resulting concepts model can score the current or new documents for concept matches. SAS Visual Analytics can also be used to report on the results.

 

If you can’t wait until the next post to make sense of these rules, refer to this documentation. Otherwise, I look forward to sharing more insights with you in the next post! It's your move!

 

08_saspchj9.png

 

 

Find more articles from SAS Global Enablement and Learning here.
