Tips and Tricks for Power Users of SAS® Visual Text Analytics: Part 2 of 3 (API Hacks)

Programmatic Manipulation of Concept Rules in SAS® Visual Text Analytics (VTA)

Note: This article references APIs that are not officially documented by SAS for external use and are not supported by SAS Technical Support.

While Model Studio offers a strong interface for developing VTA projects, we identified a need for greater flexibility to support rapid development and collaboration among multiple analysts. In this article, we’ll describe how we built custom processes to enhance the efficiency and transparency of working with taxonomies created within Concept nodes.

The enhanced functionality provides the following benefits:

- Export of concept rules from any taxonomy
- Programmatic merging and editing of concept rules
- Automated population of an existing Concept node with original or modified rules

Exporting Concept Rules

Project Details
- Project Name
- Project Data
Node Details
- Pipeline Name
- Node Name
- Taxonomy UUID
Concept Details
- Concept Name
- Concept Hierarchical Path
- Rule Definition (including comments)
- Rule Priority
- Case Sensitive Flag
- Primary / Supporting Flag

Concept Metadata
- Created By
- Created Timestamp
- Modified By
- Modified Timestamp

Figure 1: Exported table for a single VTA project, Color_Project, displaying all columns of data.

Working with the Exported Concept Rules

Exporting concept rules into tabular data opens up numerous opportunities to enhance the development cycle. One example of this increased flexibility and collaboration is the ability to combine multiple taxonomies into a single rule set. Users can also create and maintain a centralized list of commonly used concepts, which can be reused across multiple taxonomies. Additionally, rule sets created by multiple analysts can be merged into a single, unified taxonomy.
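As a rough illustration of merging rule sets from multiple analysts, the sketch below combines two small exported tables and drops exact duplicate rules. It is written in Python rather than SAS, and it makes several assumptions: the field names `concept_path` and `rule_definition` are hypothetical stand-ins for the exported columns, the sample rules are invented, and a rule is treated as a duplicate only when both its concept path and its text match.

```python
from typing import Dict, Iterable, List, Set, Tuple

Row = Dict[str, str]


def merge_rule_tables(tables: Iterable[List[Row]]) -> List[Row]:
    """Combine several exported rule tables, dropping exact duplicate rules."""
    seen: Set[Tuple[str, str]] = set()
    merged: List[Row] = []
    for table in tables:
        for row in table:
            # Assumption: a rule is identified by its concept path plus its text.
            key = (row["concept_path"], row["rule_definition"].strip())
            if key in seen:
                continue
            seen.add(key)
            merged.append(row)
    return merged


# Invented sample data standing in for two analysts' exported tables.
analyst_a = [
    {"concept_path": "COLOR/WARM", "rule_definition": "CLASSIFIER:red"},
    {"concept_path": "COLOR/WARM", "rule_definition": "CLASSIFIER:orange"},
]
analyst_b = [
    {"concept_path": "COLOR/WARM", "rule_definition": "CLASSIFIER:red"},
    {"concept_path": "COLOR/COOL", "rule_definition": "CLASSIFIER:blue"},
]

merged = merge_rule_tables([analyst_a, analyst_b])
for row in merged:
    print(row["concept_path"], row["rule_definition"])
```

Rows keep the order in which they were first seen, so rules from the first table take precedence when the same rule appears in both.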

The export process also generates a line-by-line view of the rule definitions. This format, combined with the accompanying metadata, makes it easy to identify and track changes to the rule set over time. For example, a single custom concept may include hundreds of individual concept definition rules and associated comments. Comments can be used to describe the purpose of a rule or set of rules within a custom concept. Comparing the output line by line allows users to quickly identify which rules were added, deleted, or modified.
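One way to picture the line-by-line comparison is with Python’s standard `difflib` module. The sketch below is only an illustration, not the process used in this article: it assumes each version of a concept has already been reduced to a list of rule lines, and the sample rules are invented.

```python
import difflib
from typing import List


def diff_rules(old: List[str], new: List[str]) -> List[str]:
    """Return rule lines that were removed ('- ') or added ('+ ')."""
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); skip those.
    return [
        line
        for line in difflib.ndiff(old, new)
        if line.startswith(("+ ", "- "))
    ]


# Invented sample data: two exports of the same concept taken at different times.
old_rules = ["# warm colors", "CLASSIFIER:red", "CLASSIFIER:orange"]
new_rules = ["# warm colors", "CLASSIFIER:red", "CLASSIFIER:amber", "CLASSIFIER:yellow"]

changes = diff_rules(old_rules, new_rules)
for change in changes:
    print(change)
```

A modified rule shows up as one removed line plus one added line, which is usually enough to spot edits in a concept with hundreds of rules.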

Figure 2: Subset of the exported table in Figure 1 showing multiple observations for a single concept.

Using Exported Concept Rules to Create a Structured Directory of Concept Definition Text Files

The exported concept definition table can also be processed further to create a set of text files containing the concept definition rules (one text file per concept). These text files can then be placed in a version control repository such as Git or SVN, which lets developers archive and view changes to a concept’s rules over time and share the rule files with other developers.

The code provided with this article, concept_rules_2text_files.sas, contains an example process that automatically builds the structured directory of text files, using the exported concept dataset as its sole input. Once it has run, each concept has a text (.txt) file containing its concept definition rules and an associated info (.info) file containing metadata about the concept (such as the last modified date and whether the concept is case sensitive).
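To make the idea concrete, here is a minimal Python sketch of the same general approach. It is not a port of concept_rules_2text_files.sas. The field names (`concept_path`, `rule_definition`, `case_sensitive`, `modified_timestamp`), the `/` path separator, and the JSON layout of the .info file are all assumptions made for illustration, and the sample rows are invented.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

Row = Dict[str, str]


def write_concept_files(rows: List[Row], out_dir: Path) -> List[Path]:
    """Write one .txt (rules) and one .info (metadata) file per concept."""
    by_concept: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_concept[row["concept_path"]].append(row)

    written: List[Path] = []
    for concept_path, concept_rows in by_concept.items():
        # Assumption: the hierarchical path uses '/' and maps to subdirectories.
        target = out_dir.joinpath(*concept_path.split("/"))
        target.parent.mkdir(parents=True, exist_ok=True)

        txt_file = target.with_name(target.name + ".txt")
        rules = "\n".join(r["rule_definition"] for r in concept_rows) + "\n"
        txt_file.write_text(rules, encoding="utf-8")

        # Assumption: metadata is the same on every row of a concept.
        meta = {
            "case_sensitive": concept_rows[0]["case_sensitive"],
            "modified_timestamp": concept_rows[0]["modified_timestamp"],
        }
        info_file = target.with_name(target.name + ".info")
        info_file.write_text(json.dumps(meta, indent=2), encoding="utf-8")
        written.append(txt_file)
    return written


# Invented sample rows standing in for the exported concept table.
rows = [
    {"concept_path": "COLOR/WARM", "rule_definition": "CLASSIFIER:red",
     "case_sensitive": "N", "modified_timestamp": "2024-01-01T00:00:00"},
    {"concept_path": "COLOR/WARM", "rule_definition": "CLASSIFIER:orange",
     "case_sensitive": "N", "modified_timestamp": "2024-01-01T00:00:00"},
]

out_dir = Path(tempfile.mkdtemp())
written = write_concept_files(rows, out_dir)
print(written[0])
```

Because each concept lands in its own plain-text file, an ordinary `git diff` on the output directory shows which rules changed between exports.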

Importing Concept Rules

Importing concept rules back into Model Studio requires a blank Concept node within any project, along with the exported concept rule table. The import process loops through all exported rules and adds the definitions to the specified Model Studio project. Once the process is complete, the rule definitions appear within the Concept node. Use import_concept_rules.sas to execute this import process.

Note: For Concept nodes that use predefined concepts, the user must manually enable this option in the Model Studio property panel.
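The looping logic of the import can be sketched as follows. This Python outline is not import_concept_rules.sas and makes no calls to Model Studio: the undocumented APIs are deliberately left out, and `add_definition` is a hypothetical caller-supplied function standing in for whatever actually sends a definition to the Concept node. The field names `concept_name` and `rule_definition` and the sample rows are likewise assumptions.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def import_rules(rows: List[Row], add_definition: Callable[[str, str], None]) -> int:
    """Rebuild one definition per concept and hand each to add_definition."""
    definitions: Dict[str, List[str]] = {}
    for row in rows:
        # Rows are assumed to arrive in their original line order.
        definitions.setdefault(row["concept_name"], []).append(row["rule_definition"])

    for name, lines in definitions.items():
        # Hypothetical hook: the real process would call Model Studio here.
        add_definition(name, "\n".join(lines))
    return len(definitions)


# Invented sample rows; the lambda records calls instead of contacting a server.
exported = [
    {"concept_name": "WARM", "rule_definition": "CLASSIFIER:red"},
    {"concept_name": "WARM", "rule_definition": "CLASSIFIER:orange"},
    {"concept_name": "COOL", "rule_definition": "CLASSIFIER:blue"},
]
sent = []
count = import_rules(exported, lambda name, text: sent.append((name, text)))
print(count, "concepts prepared")
```

Keeping the transport behind a callback makes it possible to dry-run the grouping step against an exported table before touching a real project.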

Figure 3: Concepts for the VTA project named Color_Project, as shown in the SAS Model Studio user interface, without Predefined Concepts enabled.

Conclusion: Part 2

We hope these tools enhance your experience with SAS Visual Text Analytics and support more agile model development. You can access the programs in our public repository: sas-vta-examples.

Please use these suggestions as you see fit and continue to share any tips and tricks you’ve found useful in your VTA practice in the comments section. In the meantime, stay tuned for our final article of this series, Part 3, focused on tracking your changes to a VTA project.