I've previously demonstrated how to perform a data completeness check using the Filtering rule type in SAS Intelligent Decisioning. In this article, I will use the other rule type, Assignment, to flag data records that have an issue. I will create several of these rules and put them together in a SAS Intelligent Decisioning decision flow.
The scenario for my example is that each week I get a data set of product information which looks like this:
Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.
Each week I want to run the following checks on the data set to determine if there are any issues with any of its records. The checks that need to be performed are:
At the end of running these checks for the product data set I should have results like this:
In this article, I will review the high-level steps of how I achieved these results using SAS Intelligent Decisioning 5.3.
The first thing I need to do is create the individual rule sets that are needed. Instead of creating a Filtering rule type, I will create Assignment rule types since I want to keep all the records from the data set and simply flag whether they pass the check or not and if not, output a reason for the failure.
The Region_Code Check needs to compare the value of the column to a list of acceptable values. In SAS Intelligent Decisioning you can create a Lookup Table to maintain those values. To create a new lookup table, select the Lookup Tables tab and click New Lookup Table.
Enter a Name, Description, and Location for the lookup table and click Save.
Select New Entries to create to create the values. Enter a Key and Value for each entry. Note: The Key column is the one that will be used for comparison within the rule set. Once all the entries have been added click Save.
Finally, select Activate to create an active version of the table, so it can be used in rule sets.
For more information on Lookup Tables, refer to the documentation.
Now that I have created the necessary lookup table, I can create the Region Code Check rule set. Below is the rule set I created.
Below are the input and output variables for the rule set:
Below is the rule set for the Product_Code (East) Check:
Below are its variables:
Below is the rule set for the Product_Code (West) Check:
Below are its variables:
Below is the rule set for the Missing Values Checks. This rule set contains two rules – Contact Check and Product Name Check.
Note: I am concatenating the Failure_Reason if one exists from a previous check.
Below are its variables:
Now that I have my individual rule sets for my scenario, I can create my decision flow. To create a new decision flow, select the Decisions tab and click New Decision.
Enter a Name, Description, and Location for the decision and click Save.
To create a decision flow, right-click and select Add to add the element needed. For more information on creating a decision flow, refer to the documentation.
I string together my rule set checks in a logical flow. Below is the decision flow that supports the scenario I outlined at the beginning of this article.
Note: I added some conditions based on the Record_Check and Region_Code values to control the flow within the decision.
Its variables are picked up from the rule sets I added. You can adjust as needed.
The next part of the process is to test the newly created decision. To do that, I select the Scoring tab and select New Test.
For a test, you can enter a Name, Description, Location, and select a Data source to map the variables used in the rule set. There is also an Advanced setting where you can change the Output data library where the results of the test are saved.
To select my input Data source, I click for the Data source field. I select the Product_Data in-memory table in the Public library. You must select an in-memory table. If the table you want to use for the test isn't already loaded to memory, then click to load it.
I select OK to confirm my input data source selection.
Next, I select the Variables button to confirm the mapping of the input data column names to the variable names used in the decision. Since my input data source had column names that matched the variable names used, they have been automatically mapped.
I select Save to save the test selections.
To run the test with the selected variable mappings, I check the test and select Run.
Once the test run is complete, I can click to view its results.
Below are my results. The Record_Check and Failure_Reason results match what I expected from my scenario.
Now that I've completed my testing of the decision, I can publish it. I select the Publish button to open the Publish Decisions window.
I select the CAS publishing destination.
Note: Your SAS Administrator maintains the publishing destinations. For more information, refer to the documentation.
I can enter a Published Name for the decision, select whether or not to Replace if one already exists, and select whether or not I want to use Rule-Fired Tracking and Path Tracking. After making these selections are made, I select the Publish button.
Note: There are restrictions on the published names which depends on its destination. Refer to the documentation for more information.
I confirm the successful publication and select Close.
On the Scoring tab for the decision, I select the Publishing Validation section. A publishing validation test has been created by publishing the decision.
I click the hyperlinked publishing validation test name to view its properties. Its input data source is the same one I used in the decision tests. I select a Location for the validation test content.
I select Save to save the changes and close the properties. I check the publishing validation test and select Run.
By running a Publishing Validation Test for a decision flow, the code used to execute that decision flow is generated. I can use that code to create a program to execute the decision flow in SAS Studio. To view the code, I need to select to view the publishing validation test results.
I select the Code tab to view the code that was submitted.
I copy the code from line 5 through line 26 and paste it into SAS Studio V.
Before those lines I add my CAS session statement, create a section to add variables I plan to use in the program, and assign all CAS libraries to the session.
Note: Since I want to perform these checks weekly against data sets with the same layout, but different names I made the code more generic by using variables for the input and output CAS libraries and tables.
Next, I add a section of code to delete the in-memory and physical output tables if they exist.
Then, I add a proc cas; statement before the code I copied from SAS Intelligent Decisioning and change the hard-coded values to the variables I created earlier in the code.
Next, I add a section to create a physical output table.
Finally, I add code to terminate my CAS session.
Below is the full program:
cas mySession sessopts=(caslib="casuser" timeout=1800 locale="en_US");
/***** ASSIGN VARIABLES *****/
%let input_caslib=Public;
%let input_table=PRODUCT_DATA;
%let output_caslib=Public;
%let output_table=&input_table._CHECK_RESULTS;
/****** END VARIABLE ASSIGNMENT *****/
/* assign all CAS libraries to session */
caslib _all_ assign;
proc cas;
/* Drop output tables (in-memory and physical), if they exist */
table.dropTable /
caslib="&output_caslib."
name="&output_table."
quiet=true;
/***** May need to change the deleteSource depending on output caslib type ******/
table.deleteSource /
caslib="&output_caslib."
source="&output_table..sashdat"
removeAccessControls=true
quiet=true;
/* Execute Decision */
proc cas;
sessionProp.setSessOpt /
caslib="&output_caslib."; /* changed value to use variable */
table.tableExists result=e /
caslib="DM"
name="sas_model_table";
haveTable = dictionary(e, "exists");
if haveTable <= 0 then do;
table.loadTable /
caslib="DM"
path="sas_model_table.sashdat";
end;
ds2.runModel / /* changed values to use variables */
modelName="ProductDataCheck1_0"
table={caslib="&input_caslib.", name="&input_table."}
modelTable={caslib="DM", name="sas_model_table"}
casOut={caslib="&output_caslib.", name="&output_table."};
table.promote / /* changed values to use variables */
caslib="&output_caslib."
name="&output_table.";
/* save physical copy of output table - added this section */
table.save /
caslib="&output_caslib."
name="&output_table."
table={caslib="&output_caslib.", name="&output_table."}
replace=true;
run;
quit;
cas mySession terminate;
In my code I have set the input table to use the same Product_Data table in the Public CAS library I used in the testing of my decision; therefore, I should get the same results. I Submit the code and view the log to ensure there are no errors.
I go to SAS Data Explorer and view the output table created. I get my expected results.
In conclusion, this article shows you a way that you can perform record checks against a data set in SAS Viya using a SAS Intelligent Decisioning decision flow to flag any rows that have an issue.
For more information, please refer to the SAS Intelligent Decisioning documentation.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.