A customer recently posed this question: “I know how to match a concept at the end of a sentence, but how can I match it at the conclusion of a document?” (For a review of concepts and other basics, see my previous post.) This post shows how to match concepts near the end of a document by combining a Concepts node with a Categories node. LITI (Language Interpretation for Textual Information) is the programming language used to code concept and category rules in SAS Visual Text Analytics.
Customer complaint call transcripts can end up being quite verbose. Sure, for starters, you could run a summary of the documents to get the main ideas being raised, and this would be a good thing to do (see my post on summarizing documents). But isn’t there something else I could also try?
The complete customer interaction likely makes some progress over its duration. The follow-up actions or conclusions of the dialog can be expected to occur towards the end of the event, after the back-and-forth discussion has become more focused on the real issues. It may not be necessary to read through the entire interaction to gain some actionable insights.
You may be most interested in returning the matches for certain conditions that occur in the last 50 terms or so of an exchange between the customer and the call agent. The text near the close of the conversation likely holds information to help you determine any appropriate follow-up actions. Designing the structure of your rules thoughtfully can provide meaningful insights. You could also create a dashboard of these conclusions over time for extra credit.
Consider this example where a Concept rule finds matching terms occurring in a document based on a list of conditions and activities that are stored in supporting rules. In the consumer financial services area, you might have a concept rule containing LITI syntax like this:
CallCenterMatches
CONCEPT_RULE:(AND,(DIST_20,"_CONDITION_", "_c{_ACTIVITY_}"))
This rule matches documents that have an item from the Condition rule within 20 tokens of an item from the Activity rule. A match requires at least one item from each table in the document: the DIST_20 operator needs both of its arguments within that distance, and the enclosing AND operator requires its argument to match in the document.
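If it helps to see the distance logic outside of the product, here is a rough Python sketch of the kind of check a DIST_20 rule performs. This is not how SAS Visual Text Analytics implements the operator. The word lists, the tokenize helper, and the dist_matches function are all hypothetical, and the product's tokenizer and its exact way of counting distance may differ.

```python
import re

# Hypothetical stand-ins for the Condition and Activity tables
# (not the lists used in this post).
CONDITIONS = {"overdraft", "fraud", "late"}
ACTIVITIES = {"refund", "dispute", "escalate"}

def tokenize(text):
    # Assumption: words (hyphens and apostrophes kept inside) and each
    # punctuation mark count as separate tokens.
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", text.lower())

def dist_matches(text, n=20):
    """Return (condition_index, activity_index) pairs at most n tokens apart."""
    tokens = tokenize(text)
    cond_pos = [i for i, t in enumerate(tokens) if t in CONDITIONS]
    act_pos = [i for i, t in enumerate(tokens) if t in ACTIVITIES]
    # Order does not matter, so compare the absolute distance.
    return [(c, a) for c in cond_pos for a in act_pos if abs(c - a) <= n]

sample = "Customer reported fraud on the card and asked us to dispute the charge."
print(dist_matches(sample))        # condition and activity found close together
print(dist_matches(sample, n=5))   # a tighter window drops the pair
```

A document with only a condition, or only an activity, returns an empty list here, which mirrors the requirement that both tables contribute to a match.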
The two example tables below use LITI syntax: one is a table of conditions and the other is a table of activities. The items from the two tables can appear in any order in the text. The @ symbol indicates morphological expansions of the term (all forms of a term like report, reports, reported, reporting…).
Here are some of the 635 matches that SAS Visual Text Analytics returned for the CallCenterMatches rule. The activity term is highlighted, because the _c{} modifier in the rule marks the Activity item as the part of the match to return.
OK, this is great, but we already know how to use the CONCEPT_RULE and all its bells and whistles. What else can I do?
What if you have a long call transcript and only want to know what happens towards the end of the call? This kind of focus may do a better job of identifying follow-up actions with the customer than our CallCenterMatches rule on its own. To do this, we’ll add a category rule to find a concept if it exists at the end of the interaction (call, chat, or transcribed audio).
Next, we will create a category rule based on the concept and use an easily overlooked operator available in the Categories node.
I can connect a categories node directly to a concepts node as shown in the right-hand branch of this pipeline, since both nodes can do their own parsing.
I added the following Categories rule so I could see only the matches at the end of the document. The syntax of the Categories node is different from that of the Concepts node, so this syntax is not valid for use in the Concepts node.
I created my new Category rule using this syntax:
(END_40, "[CallCenterMatches]")
The square brackets inside the double quotation marks tell the Categories node to look for matches of a concept rule rather than a literal term.
After the Categories node restricts the concept matches to the last 40 tokens of these call reports, only 190 documents match, compared with 635 document matches in the Concepts node.
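To picture why the count drops, here is a small Python sketch of the filtering idea: keep only the documents where a concept match falls inside the final n tokens. It is only an illustration under my own assumptions. The function names and the sample spans are hypothetical, and I assume the whole match must sit inside the window, which may not be exactly how the Categories node decides.

```python
def in_end_window(match_start, match_end, doc_length, n=40):
    """True if the token span [match_start, match_end] lies in the last n tokens."""
    window_start = max(0, doc_length - n)
    return match_start >= window_start and match_end < doc_length

def documents_matching_at_end(concept_matches, doc_lengths, n=40):
    """Keep the ids of documents with at least one concept match near the end.

    concept_matches: {doc_id: [(start_token, end_token), ...]}
    doc_lengths:     {doc_id: number_of_tokens}
    """
    return [
        doc_id
        for doc_id, spans in concept_matches.items()
        if any(in_end_window(s, e, doc_lengths[doc_id], n) for s, e in spans)
    ]

# Hypothetical token offsets for three 200-token transcripts.
concept_matches = {
    "call_a": [(10, 25)],             # match near the start only
    "call_b": [(170, 185)],           # match inside the last 40 tokens
    "call_c": [(150, 165), (5, 9)],   # one match straddles the window edge
}
doc_lengths = {"call_a": 200, "call_b": 200, "call_c": 200}
print(documents_matching_at_end(concept_matches, doc_lengths))
```

All three transcripts match the concept somewhere, but only one of them matches it near the end, which is the same narrowing effect as going from 635 to 190 documents.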
This is the end of one of the matched documents, with the condition and activity terms highlighted within the final 40 tokens!
This is the description of the END_n Categories rule.
END_n (From the end of the document)
Takes a value for n and one or more arguments. Matches if each argument occurs within n tokens from the end of the document. For example, the rule (END_35, "conclusion") produces a match if conclusion is found within 35 words from the last token in the document. Note: Punctuation counts as a token. In some languages, words that include hyphens are counted as one word (for example, merry-go-round is one token).
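The token counting matters more than you might expect, so here is a quick Python sketch of it. The tokenize and end_n functions are hypothetical helpers of mine, not SAS functions, and the regular expression is only an approximation of the behavior described above (punctuation is a token, a hyphenated word is one token).

```python
import re

def tokenize(text):
    # Assumption: hyphenated words stay whole and every punctuation
    # mark is its own token.
    return re.findall(r"\w+(?:-\w+)*|[^\w\s]", text)

def end_n(text, term, n):
    """True if term appears among the last n tokens of text."""
    tokens = tokenize(text)
    tail = tokens[-n:] if n > 0 else []
    return term.lower() in (t.lower() for t in tail)

text = "In conclusion, we rode the merry-go-round twice."
print(tokenize(text))
print(len(tokenize(text)))           # the comma and the period are counted
print(end_n(text, "conclusion", 8))  # inside the window
print(end_n(text, "conclusion", 7))  # just outside the window
```

Because the comma and the final period each use up a position, a term can fall outside the window sooner than a plain word count would suggest. It is worth leaving a little slack in the value you pick for n.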
So, there you have it! You can combine concept rules within categories to get specific matches based on your needs. There are many other hidden treasures in the land of LITI rules and who knows, I may reveal some more in posts yet to come!!!
Thanks for reading and let me know if you use a form of this rule.
BTW,
If you really want to learn this stuff, here is an exercise for you to try. See if you can identify what each of these rules will return. Do some rules not give you what you are really looking for? Do any of these rules return more than the information we are looking for? Read about the DIST_n, AND, OR, and _c operators for Concept rules for hints.
CONCEPT_RULE:(DIST_20, "_c{_CONDITION_}", "_ACTIVITY_")
CONCEPT_RULE:(AND, "_c{_CONDITION_}", (OR, "_ACTIVITY_"))
CONCEPT_RULE:(OR,(DIST_20, "_c{_CONDITION_}", "_c{_ACTIVITY_}"))
CONCEPT_RULE:(AND,(DIST_20,(OR,"_c{_CONDITION_}", "_c{_ACTIVITY_}")))
Find more articles from SAS Global Enablement and Learning here.