Programmatic Manipulation of Concept Rules in SAS® Visual Text Analytics (VTA)
Note: This article references APIs that SAS does not officially document for external use and that SAS Technical Support does not support.
While Model Studio offers a strong interface for developing VTA projects, we identified a need for greater flexibility to support rapid development and collaboration among multiple analysts. In this article, we’ll describe how we built custom processes to enhance the efficiency and transparency of working with taxonomies created within Concept nodes.
The enhanced functionality provides the following benefits:
Process to export concept rules from any taxonomy
Programmatic merging and editing of concept rules
Automated population of an existing Concept node with original or modified rules
Exporting Concept Rules
The exported table contains the following columns, grouped by category:
Project Details: Project Name, Project Data
Node Details: Pipeline Name, Node Name, Taxonomy UUID
Concept Details: Concept Name, Concept Hierarchical Path, Rule Definition (including comments), Rule Priority, Case Sensitive Flag, Primary / Supporting Flag
Concept Metadata: Created By, Created Timestamp, Modified By, Modified Timestamp
Figure 1: Exported table for a single VTA project, Color_Project, displaying all columns of data.
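To make the shape of one exported observation concrete, here is a minimal Python sketch of a record type with one field per exported column. It is not part of the SAS programs described in this article. The field names and types are assumptions derived from the column labels, and every sample value other than the project name Color_Project is a placeholder.

```python
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass(frozen=True)
class ConceptRuleRow:
    """One observation (one rule line) of the exported table.

    Field names are hypothetical; they mirror the column labels only.
    """
    project_name: str
    project_data: str
    pipeline_name: str
    node_name: str
    taxonomy_uuid: str
    concept_name: str
    concept_path: str
    rule_definition: str        # rule text or comment line
    rule_priority: int
    case_sensitive: bool
    is_primary: bool            # True = primary, False = supporting
    created_by: str
    created_timestamp: datetime
    modified_by: str
    modified_timestamp: datetime


# Placeholder values for illustration only.
row = ConceptRuleRow(
    project_name="Color_Project",
    project_data="example_table",
    pipeline_name="Pipeline 1",
    node_name="Concepts",
    taxonomy_uuid="00000000-0000-0000-0000-000000000000",
    concept_name="WarmColor",
    concept_path="Colors",
    rule_definition="CLASSIFIER:red",
    rule_priority=10,
    case_sensitive=False,
    is_primary=True,
    created_by="analyst_a",
    created_timestamp=datetime(2024, 1, 1, 9, 0),
    modified_by="analyst_a",
    modified_timestamp=datetime(2024, 1, 1, 9, 0),
)

column_names = [f.name for f in fields(ConceptRuleRow)]
print(len(column_names), "columns:", ", ".join(column_names))
```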
Working with the Exported Concept Rules
Exporting concept rules into tabular data opens up numerous opportunities to enhance the development cycle. One example of this increased flexibility and collaboration is the ability to combine multiple taxonomies into a single rule set. Users can also create and maintain a centralized list of commonly used concepts, which can be reused across multiple taxonomies. Additionally, rule sets created by multiple analysts can be merged into a single, unified taxonomy.
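As an illustration of merging rule sets from multiple analysts, the following Python sketch combines two exported tables and drops exact duplicate rules within a concept. It is a simplified stand-in, not the authors' SAS implementation. The keys concept_name and rule_definition, the concept names, and the sample rules are all hypothetical.

```python
def merge_rule_sets(*rule_sets):
    """Merge exported rule rows, keeping the first occurrence of each
    (concept, rule text) pair and preserving the original order."""
    seen = set()
    merged = []
    for rules in rule_sets:
        for row in rules:
            key = (row["concept_name"], row["rule_definition"].strip())
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


# Hypothetical exports from two analysts working on the same taxonomy.
analyst_a = [
    {"concept_name": "WarmColor", "rule_definition": "CLASSIFIER:red"},
    {"concept_name": "WarmColor", "rule_definition": "CLASSIFIER:orange"},
]
analyst_b = [
    {"concept_name": "WarmColor", "rule_definition": "CLASSIFIER:red"},
    {"concept_name": "CoolColor", "rule_definition": "CLASSIFIER:blue"},
]

merged = merge_rule_sets(analyst_a, analyst_b)
for row in merged:
    print(row["concept_name"], row["rule_definition"])
```

Keeping the first occurrence means the earlier rule set wins when two analysts contribute the same rule; a real merge might instead compare the modified timestamps from the export.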
The export process also generates a line-by-line view of the rule definitions. This format, combined with the accompanying metadata, makes it easy to identify and track changes to the rule set over time. For example, a single custom concept may include hundreds of individual concept definition rules and associated comments. Comments can be used to describe the purpose of a rule or set of rules within a custom concept. Comparing the output line by line allows users to quickly identify which rules were added, deleted, or modified.
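The line-by-line comparison described above can be sketched with Python's standard difflib module. This is an illustrative example rather than the tooling used in this article, and the two rule lists are made-up versions of a single concept's exported lines.

```python
import difflib


def diff_rules(old, new):
    """Return (added, removed) rule lines between two versions of a concept."""
    added, removed = [], []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(old[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added, removed


# Hypothetical exported lines for one concept at two points in time.
old_rules = [
    "# Warm colors",
    "CLASSIFIER:red",
    "CLASSIFIER:orange",
]
new_rules = [
    "# Warm colors",
    "CLASSIFIER:red",
    "CLASSIFIER:amber",
    "CLASSIFIER:yellow",
]

added, removed = diff_rules(old_rules, new_rules)
print("added:  ", added)
print("removed:", removed)
```

In this sketch a modified rule appears as one removed line plus one added line, which is also how most version control tools would display it.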
Figure 2: Subset of the exported table in Figure 1 showing multiple observations for a single concept.
Using Exported Concept Rules to Create a Structured Directory of Concept Definition Text Files
The exported table can also be processed further to create a set of text files containing the concept definition rules, one text file per concept. Placing these files in a version control repository such as Git or SVN lets developers archive and review changes to a concept’s rules over time and share the concept rule files with other developers.
The code provided in this article, concept_rules_2text_files.sas, contains an example process that automatically builds the structured directory of text files, using the exported concept dataset as its sole input. After it runs, each concept has a text (.txt) file containing its concept definition rules and an associated info (.info) file containing metadata about the concept (such as the last modified date and whether the concept is case sensitive).
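The general idea of one rules file and one metadata file per concept can be sketched in Python as follows. This is not a port of concept_rules_2text_files.sas. The row keys, the "/" separator in the hierarchical path, the line_number column, the JSON layout of the .info file, and all sample values are assumptions made for illustration.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def write_concept_files(rows, root):
    """Write <concept>.txt and <concept>.info under root; return the .txt paths."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["concept_path"], row["concept_name"])].append(row)

    written = []
    for (concept_path, concept_name), concept_rows in grouped.items():
        # Assumption: the hierarchical path uses "/" between parent concepts.
        folder = Path(root).joinpath(*[p for p in concept_path.split("/") if p])
        folder.mkdir(parents=True, exist_ok=True)

        # Restore the original rule order before writing.
        ordered = sorted(concept_rows, key=lambda r: r["line_number"])
        txt_file = folder / f"{concept_name}.txt"
        text = "\n".join(r["rule_definition"] for r in ordered) + "\n"
        txt_file.write_text(text, encoding="utf-8")

        # Assumption: metadata is stored as JSON; the real .info layout may differ.
        info = {
            "concept_name": concept_name,
            "case_sensitive": ordered[0]["case_sensitive"],
            "modified_timestamp": ordered[0]["modified_timestamp"],
        }
        info_file = folder / f"{concept_name}.info"
        info_file.write_text(json.dumps(info, indent=2) + "\n", encoding="utf-8")
        written.append(txt_file)
    return written


# Made-up rows, deliberately out of order to show the sort.
rows = [
    {"concept_path": "Colors", "concept_name": "WarmColor", "line_number": 2,
     "rule_definition": "CLASSIFIER:red", "case_sensitive": False,
     "modified_timestamp": "2024-01-15T10:30:00"},
    {"concept_path": "Colors", "concept_name": "WarmColor", "line_number": 1,
     "rule_definition": "# Warm colors", "case_sensitive": False,
     "modified_timestamp": "2024-01-15T10:30:00"},
    {"concept_path": "Colors", "concept_name": "CoolColor", "line_number": 1,
     "rule_definition": "CLASSIFIER:blue", "case_sensitive": False,
     "modified_timestamp": "2024-01-10T08:00:00"},
]

with tempfile.TemporaryDirectory() as tmp:
    for path in sorted(write_concept_files(rows, tmp)):
        print(path.relative_to(tmp))
```

Because each concept lives in its own file, a version control diff is scoped to the concept that changed.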
Importing Concept Rules
Importing concept rules back into Model Studio requires a blank Concept node within any project, along with the exported concept rule table. The import process loops through all exported rules and adds the definitions to the specified Model Studio project. Once the process is complete, the rule definitions appear within the Concept node. Use import_concept_rules.sas to execute this import process.
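The looping step can be outlined in Python without reference to any particular API. The sketch below regroups exported rows into one payload per concept and passes each payload to a caller-supplied function. It does not reproduce import_concept_rules.sas or any SAS endpoint: the send callback, the payload keys, and the row keys are all hypothetical placeholders for whatever actually writes to the Concept node.

```python
from itertools import groupby


def build_concept_payloads(rows):
    """Group exported rule rows into one payload per concept."""
    def concept_key(r):
        return (r["concept_path"], r["concept_name"])

    ordered = sorted(rows, key=lambda r: (*concept_key(r), r["line_number"]))
    payloads = []
    for (path, name), group in groupby(ordered, key=concept_key):
        group = list(group)
        payloads.append({
            "name": name,                                  # hypothetical key
            "path": path,                                  # hypothetical key
            "case_sensitive": group[0]["case_sensitive"],  # hypothetical key
            "definition": "\n".join(r["rule_definition"] for r in group),
        })
    return payloads


def import_rules(rows, send):
    """Call send once per concept and return the number of concepts sent.

    send is a placeholder for the real write operation.
    """
    count = 0
    for payload in build_concept_payloads(rows):
        send(payload)
        count += 1
    return count


# Made-up exported rows.
rows = [
    {"concept_path": "Colors", "concept_name": "WarmColor", "line_number": 2,
     "rule_definition": "CLASSIFIER:orange", "case_sensitive": False},
    {"concept_path": "Colors", "concept_name": "WarmColor", "line_number": 1,
     "rule_definition": "CLASSIFIER:red", "case_sensitive": False},
    {"concept_path": "Colors", "concept_name": "CoolColor", "line_number": 1,
     "rule_definition": "CLASSIFIER:blue", "case_sensitive": False},
]

sent = []                       # collect payloads instead of calling a service
imported = import_rules(rows, sent.append)
print(imported, "concepts prepared:", [p["name"] for p in sent])
```

Passing the write operation in as a function keeps the grouping logic testable without a running Model Studio environment.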
Note: For Concept nodes that use predefined concepts, the user must manually enable this option in the Model Studio property panel.
Figure 3: Concepts for the VTA project named Color_Project, as shown in the SAS Model Studio user interface, without Predefined Concepts enabled.
Conclusion: Part 2
We hope these tools enhance your experience with SAS Visual Text Analytics and support more agile model development. You can access the programs in our public repository: sas-vta-examples.
Please use these suggestions as you see fit and continue to share any tips and tricks you’ve found useful in your VTA practice in the comments section. In the meantime, stay tuned for our final article of this series, Part 3, focused on tracking your changes to a VTA project.