
Scoring SAS Visual Text Analytics Models with Analytic Stores in SAS Studio

Started ‎06-23-2023 by
Modified ‎06-23-2023 by

This article shows how to use SAS Visual Text Analytics models to score external text data in SAS Studio by using the analytic store (ASTORE) method. With this functionality you can score a collection of documents to extract information in a given context (using a Concepts model), to discover the themes or topics in the document collection (using a Topics model), and to categorize the documents (using a Categories model).

SAS Visual Text Analytics is a web-based text analytics application on SAS Viya that combines the power of Natural Language Processing (NLP), machine learning, and linguistic rules to reveal insights in text data. SAS Visual Text Analytics builds several types of text models, including contextual models, sentiment models, topic models, and categorization models. As new documents arrive, they can be scored with the models developed in Visual Text Analytics.

The narrative that follows assumes that you have already created a pipeline in Model Studio. Pipelines are structured flows of analytic actions. These analytic actions are represented as individual nodes in a pipeline.

 

Downloading Score code from analysis nodes in SAS Visual Text Analytics

To download score code from an analysis node, complete the following steps:

  • Navigate to the Pipelines tab in Model Studio and run the pipeline.
  • When the pipeline run is complete, right-click the analysis node (for example, the Concepts node) whose score code you want to download and select Download score code.
    • When you download score code from a Concepts node, the resulting ZIP file contains the following: the concepts model (ConceptsModel.li) and its associated score code (ScoreCode.sas), and the concepts analytic store (ConceptsModel.astore) and its associated score code (AstoreScoreCode.sas). The ConceptsModel.li file contains the compiled LITI concepts and is used by ScoreCode.sas file while scoring.
    • When you download score code from a Topics node, the resulting ZIP file contains the topics analytic store (TopicsModel.astore) and its associated score code (AstoreScoreCode.sas).
    • When you download score code from a Categories node, the resulting ZIP file contains the following: the categories model (CategoriesModel.mco) and its associated score code (ScoreCode.sas), and the categories analytic store (CategoriesModel.astore) and its associated score code (AstoreScoreCode.sas). MCO files are binary-encoded concept files and are used by the ScoreCode.sas file during scoring. The MCO files can also be processed by other SAS products (for example, SAS Text Miner).

The ZIP file is saved on the client computer, and its location depends on the browser used. For example, Google Chrome saves the ZIP file to the system download folder on the client machine.

The ScoreCode.sas files contain a SAS program for batch scoring, whereas the analytic store (.astore) file and its associated score code (AstoreScoreCode.sas) use analytic store (ASTORE) scoring. An analytic store is a binary file that contains information about the state of an analytic object. It stores information that enables you to load and restore the state of the analytic object and set it in a score-ready mode. The analytic store is transportable. That is, an ASTORE can be produced on one host and consumed on others without the need for a traditional SAS export or import.

 

Scoring data in SAS Studio by using an Analytic Store (.astore) generated by Concepts node

Note that the downloaded score code files (.astore and .sas files) are saved on the client machine. However, you must save the .astore file from the downloaded score code in a location that is network-accessible by the CAS server.

You also need to move the analytic store from the local file system into CAS. Launch SAS Studio and submit the following statements to send the analytic store from the local file system to the data table Public.ASRS_Concepts in the CAS session.

 

/*Uploading Astore from Local File System*/
proc astore;
upload rstore=Public.ASRS_Concepts
store="/workshop/VTXT/ConceptsModel.astore";
quit;
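
The two-level name Public.ASRS_Concepts in the statements above presumes that a CAS session is active and that a libref named Public points to the Public caslib. The following is a minimal sketch of one way to set that up before running the upload, assuming that the SAS Studio session is already configured with the default CAS server. The session name mysess is illustrative.

```sas
/* Assumption: CAS host and port defaults are configured for this SAS Studio session */
cas mysess;              /* start a CAS session; the name mysess is illustrative */

/* Assign librefs for the caslibs visible to the session, so that two-level
   names such as Public.ASRS_Concepts resolve to CAS tables */
caslib _all_ assign;
```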

The PROC ASTORE statement invokes the procedure and does not require any options.

The UPLOAD statement moves an analytic store from the local file system into a data table in CAS. RSTORE=Public.ASRS_Concepts specifies the CAS table to which the store is sent. STORE="/workshop/VTXT/ConceptsModel.astore" specifies the full path of the valid store file that was created earlier by the analysis node and that exists in the local file system. Once the .astore file has been moved into the CAS session, it is available for scoring. The following example shows how to score an input table by using the information in the analytic store.

The SCORE statement enables you to score models. DATA=Public.ASRS names the input data table for PROC ASTORE to use. OUT=Public.Concepts_Scored specifies the output data table. RSTORE=Public.ASRS_Concepts specifies the data table in CAS that contains the analytic store.

 

/*Scoring table using ASTORE procedure*/
proc astore;
score data=Public.ASRS out=Public.Concepts_Scored
rstore=Public.ASRS_Concepts;
run;

The output table resembles the following. Hover over any value in the first column to display the matched text, its start and end byte positions in the original document, and the matched concepts.

 

[Image: ms_1_OutConcepts_ASRS-1024x403.png, the scored concepts output table]

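
If the scored table does not look as expected, it can help to check which input variables the uploaded store requires and to look at a few scored rows directly. The sketch below uses the DESCRIBE statement of PROC ASTORE and a PROC PRINT step. It assumes that the table names from the steps above are unchanged, and the OBS=5 limit is an arbitrary choice.

```sas
/* List the input and output variables recorded in the uploaded store */
proc astore;
    describe rstore=Public.ASRS_Concepts;
quit;

/* Print the first few rows of the scored output; obs=5 is arbitrary */
proc print data=Public.Concepts_Scored(obs=5);
run;
```

The text variable in Public.ASRS must match the name and type that DESCRIBE reports for the store's input variable.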

 

You can use similar steps to score external data with Topics and Categories models.
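
As an illustration of those similar steps, here is a sketch for a Topics model. The file path and the table names ASRS_Topics and Topics_Scored are hypothetical and chosen by analogy with the Concepts example. The input table is assumed to carry the same text variable that was used when the Topics node was run.

```sas
/* Hypothetical path and table names; substitute your own values */
proc astore;
    upload rstore=Public.ASRS_Topics
           store="/workshop/VTXT/TopicsModel.astore";
quit;

/* Score the same input table with the uploaded topics store */
proc astore;
    score data=Public.ASRS
          out=Public.Topics_Scored
          rstore=Public.ASRS_Topics;
quit;
```

For a Categories model, the same pattern would apply with CategoriesModel.astore.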

 

Using Astore Score code generated by Concepts node in Visual Text Analytics

Next, you use the SAS code (AstoreScoreCode.sas) generated by the Concepts node to perform scoring. Before you submit the code, you must specify values for some of the macro variables.

This SAS score code file also makes use of the .astore file under the hood to score text data.

When you use the SAS score code generated by the Concepts node for scoring, you need to specify the fully qualified path to the .astore file. However, the SAS score code generated by the Topics and Categories nodes has this path specified with a system-generated value. Therefore, if you intend to use the same code for scoring in an environment other than the one that SAS Visual Text Analytics used to produce the model, you must modify the value of this path (as shown in the example below). A similar modification is required for the “cas_server_hostname” value so that it matches the hostname of the server where scoring is performed, because by default the “cas_server_hostname” value is set automatically to the host for the associated SAS Visual Text Analytics project.

 

/*****************************************************************
* SAS Visual Text Analytics
* Concepts Astore Score Code
*
* Modify the following macro variables to match your needs.
*
* NOTE: The text variable on the input table must match the
* name and type of the text variable in the table that was used
* to create the analytic store (astore) table.
****************************************************************/
/* specifies CAS library information for the CAS table that you would like to score. You must modify the value to provide the name of the library that contains the table to be scored. */
%let input_caslib_name = "Public";
/* specifies the CAS table you would like to score. You must modify the value to provide the name of the input table, such as "MyTable". Do not include an extension. */
/* NOTE: The text variable on the input table must match the name and type of the text variable that was used when the astore was created. */
%let input_table_name = "ASRS";

/* specifies the variables in the input table to copy to the output tables. You must modify the value to specify variables that you want to copy to the output tables, such as "doc_id". Copying the document identifier will map the results to the input data. */
%let copy_vars_variables = "docID";

/* specifies the fully qualified path to the concepts model .astore file to upload for use in scoring. You must store the concepts model .astore file from the downloaded score code in a location that is network-accessible by the CAS server. You must modify the value of local_astore_file_path to provide the path to the .astore file, such as "/vta/scoring/{concepts model.astore file}". */
%let local_astore_file_path = "/workshop/VTXT/ASRS_ConceptsModel.astore";
/* After uploading the concepts model .astore file, you must specify a CAS library to write out the astore table to use in scoring. This library will be used in the CAS session during the ASTORE scoring action. You must modify the value to provide the name of the library that will contain the astore table. */
%let input_astore_caslib_name = "Public";

/* specifies the CAS astore table to use in scoring */
%let input_astore_name = "ASRS_Concepts_Astore";

/* specifies the CAS library to write the score output tables. You must modify the value to provide the name of the library that will contain the output tables that the score code produces. */
%let output_caslib_name = "Public";

/* specifies the concepts output CAS table to produce */
%let output_table_name = "ASRS_out_concept_astore_results";

/* specifies the hostname for the CAS server. This should be set automatically to the host for the associated SAS Visual Text Analytics project. */
%let cas_server_hostname = "sas-cas-server-default-client";

/* specifies the port for the CAS server. This should be set automatically to the port for the associated SAS Visual Text Analytics project. */
%let cas_server_port = 5570;

/* creates a session and a library reference */
cas mysess host=&cas_server_hostname port=&cas_server_port sessopts=(caslib=&input_astore_caslib_name);
libname mycas cas sessref=mysess datalimit=all;

/* uploads the analytic store to the CAS server */
%let input_astore_name_unquoted = %qsysfunc(dequote(&input_astore_name));
proc astore;
upload rstore=mycas.&input_astore_name_unquoted
store=&local_astore_file_path;
quit;
/* calls the scoring action */
proc cas;
session mysess;
loadactionset "astore";

action astore.score;
param
table={caslib=&input_caslib_name, name=&input_table_name}
rstore={name=&input_astore_name}
options={{name="extend_out_char_var_bytes", value=2048}}
copyVars={&copy_vars_variables}
casout={caslib=&output_caslib_name, name=&output_table_name, replace=TRUE}
;
run;
quit;

 

In the macro variables of the astore score code, look for the “{…….}” snippets and substitute values accordingly. For input_caslib_name specify “Public”, for input_table_name specify “ASRS”, and so on. Pay particular attention to the local_astore_file_path value, which must point to the .astore file in a location that is network-accessible by the CAS server. Then submit the modified score code and view the results. The output table can be interpreted in the same way as the previous results table.
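
After the score code has run, the results can be inspected from the same SAS Studio session. The following sketch assumes that the mycas libref and the mysess session created by the score code are still active and that the output caslib is the session's active caslib (both are "Public" in this example). The macro variable output_table_unquoted is a hypothetical helper introduced here, and OBS=5 is an arbitrary limit.

```sas
/* Strip the quotation marks from the table name so it can be used in a two-level name */
%let output_table_unquoted = %sysfunc(dequote(&output_table_name));

/* Print the first few rows of the concepts output table */
proc print data=mycas.&output_table_unquoted(obs=5);
run;

/* Optional: end the CAS session when scoring is finished */
cas mysess terminate;
```

Tables in a session-scope caslib disappear when the session ends, so terminate the session only after you have reviewed or saved the output.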

Having seen the two approaches above, scoring with the .astore file directly and scoring with the generated SAS score code, you may wonder which one to use. It depends.

 

Use the .astore file when you deploy the model by using SAS Scoring Accelerator or in an environment other than SAS. This approach requires you to write your own code. If you are not a proficient coder, you may prefer the SAS score code that the GUI generates (SAS Visual Text Analytics in this case). Note that the SAS score code can be used for scoring only in SAS environments.

 

To learn about building models using SAS Visual Text Analytics, you may want to register for the course SAS® Visual Text Analytics in SAS® Viya®.

Version history
Last update:
‎06-23-2023 04:50 AM