A recent customer requirement to allow searching for data items using multiple terms captured using text box object in Visual Analytics had me thinking.
To frame the problem it is important to understand the data representation for this particular use case. PDF documents are converted into structured datasets. Text analytics functions are performed on the data to identify specific key terms from the data. The terms matched in these documents are concatenated together with a separator (in this case single-pipe |) and stored in the datasets. The user wanted the ability to search for any combination of terms and the report objects would then filter the records containing all or any number of terms to display which documents contained the terms. A final ask form the customer was to use "out-of-the-box" Visual Analytics functionality and not use any custom objects or scripts or workarounds.
Below is a small example of the dataset to better help understand the use case.
The following video showcases the requirement.
First we start by placing a Text object in the report page. Next a parameter (ENVN_EPA_HAB_cat_param) is created and assigned to the text object's parameter role. Note that "multiple value" parameters are not supported in this use case.
Our text input object will allow the user to capture up to six terms separated by a comma. Note that using 6 terms in this scenario was by design and for your use cases you can define as many as you require following the same principles as described here.
Next we have to parse the entries typed into the Text object, the value stored in the parameter linked to the text object. To do this we require six calculated items. Each calculated item will serve to parse the parameter (ENVN_EPA_HAB_cat_param) by using the "GetWord" function. The example below shows the custom function to get the 3rd word in the typed list of terms.
The function GetWord follows the structure, input value, in this case the parameter (ENVN_EPA_HAB_cat_param) and the number of the word to be parsed, in this case the third word. The default word delimiter is comma. The six calculated items are named: EVEPHA_cat_1 through to EVEPHA_cat_6.
The second object we have to create on the report page is the List object and the column (keyTerms_EPA) which contains the terms we are searching against is mapped in the category role. We do this to display the result of the search after terms are input into the Text box above. This also allows this object to be linked to other objects in the report so that "Actions" can be mapped for interaction purposes.
The secret sauce is the next step in the process. We now create and advanced filter for the List object. Below is the structure of the advanced filter.
From this advanced filter definition you can see the first step is to ensure that the parameter (ENVN_EPA_HAB_cat_param) contains text values. If it does, the expression continues by assessing if the term field in the dataset contains the subsequent terms 1 through to 6. Using the AND allows multiple terms to be searched against. If for example the custom function EVEPHA_cat_5 returns a null, it is ignored in the filter. The advanced filter subsequently allows the List object to be filtered with the values entered in the Text object as soon as the user completes entering terms in the Text object.
Note however the process is only complete after the user clicks on the "Select all" option in the List object to ensure the List object recognizes the filtered items. All dependent objects linked to the List object will then be refreshed to reflect the filtered items.
Finding a solution to allowing a user to search through concatenated terms in a data field using a text box as term input was quite the challenge initially. There are most likely other approaches to solving this problem however after many tries this approach appears to function in a repeatable manner and one that provides the best performance.
If you have an alternative view on potentially solving this problem in a different way please comment below.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.