When I last looked at CAS programming, I mentioned that, while much more powerful than SAS, CAS still looked and felt like SAS. In that post, I was talking about the DATA Step, which is, of course, the heart of SAS programming. In this post, let's look at something completely new: CAS Actions. In particular, let's focus on the COMPUTEDVARS and COMPUTEDVARSPROGRAM parameters, which define an action's computed columns (also called computed variables).
So, even though the CAS Action paradigm is new, is it still grounded in the SAS concepts we all know and love? (Hint: it is.)
CAS Actions are the finest-grained way to call CAS. Each call instructs CAS to perform one and only one action, for example partitioning a table (table.partition) or summarizing it (simple.summary).
In many ways, CAS Actions are similar to SAS procedures (PROCs) in that both generally perform some well-defined processing algorithm on input data and produce some kind of output, possibly tables and/or listing files (reports). SAS procedures also play a role in CAS: certain PROCs are CAS-enabled, meaning that they trigger CAS actions to read and process CAS data. In general, a CAS-enabled PROC calls one or more CAS actions to perform its CAS processing.
CAS Actions can be called from SAS (PROC CAS), Python, REST, as well as a few other languages. While everything we talk about in this post applies to all of these platforms, we'll use SAS here.
From SAS, CAS actions are called using PROC CAS, and the language used inside PROC CAS is called CASL. Below is an example of the table.partition action.
proc cas;
   table.partition /
      table={name="mega_corp"
             caslib="visual"
             computedVars={{name="Margin"} {name="RemainingLife"}}
             computedVarsProgram="Margin = Revenue - ExpensesMaterial;
                                  RemainingLife = UnitLifespan - UnitLifespanLimit;"
            }
      casout={name="MGTest"
              caslib="casuser"
              replace=true
              compress=false
             };
run;
quit;
Within the PROC CAS block, the desired CAS action is stated (table.partition), followed by a slash and then the various optional and required parameters for the action. As the example shows, parameters are generally set as name=value pairs, where the value is either a simple value or a list. Commas between parameters and list items are optional, and lists are enclosed in braces ({}).
As shown in the example above, we created two variables, Margin and RemainingLife.
These are listed within the COMPUTEDVARS parameter. The only required attribute for a computed column within this list parameter is the name. However, you can define many more attributes for your new column, including format, label, length, and precision.
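As a rough sketch of those optional attributes, the variation below adds a label and a format to each computed column and then calls table.columnInfo to list the columns of the output table. The label text, the format choices, and the output table name MGTestLabeled are illustrative assumptions rather than part of the original example, and the exact attribute names are worth checking against the computedVars documentation for your release.

```sas
proc cas;
   /* Same partition call as before, with optional attributes on each computed column */
   table.partition /
      table={name="mega_corp"
             caslib="visual"
             computedVars={{name="Margin"
                            label="Revenue less material expenses"  /* assumed label  */
                            format="DOLLAR16.2"}                    /* assumed format */
                           {name="RemainingLife"
                            label="Lifespan minus lifespan limit"   /* assumed label  */
                            format="COMMA8.1"}}                     /* assumed format */
             computedVarsProgram="Margin = Revenue - ExpensesMaterial;
                                  RemainingLife = UnitLifespan - UnitLifespanLimit;"
            }
      casout={name="MGTestLabeled"   /* hypothetical output table name */
              caslib="casuser"
              replace=true
             };

   /* List the columns of the new table, including their labels and formats */
   table.columnInfo /
      table={name="MGTestLabeled" caslib="casuser"};
run;
quit;
```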
Each variable created in COMPUTEDVARS must be defined in the COMPUTEDVARSPROGRAM parameter. This parameter must be set to a string containing DATA Step code. Generally, the computed variables are defined using assignment statements. If-Then-Else blocks are also allowed, so conditional logic is possible.
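To illustrate the conditional logic, here is a minimal sketch that adds a flag column using an If-Then-Else block inside the COMPUTEDVARSPROGRAM string. The column name Profitable and the output table name MGFlagged are hypothetical, chosen only for this illustration; the input columns are the ones used elsewhere in this post.

```sas
proc cas;
   table.partition /
      table={name="mega_corp"
             caslib="visual"
             /* Profitable is a hypothetical flag column: 1 when revenue exceeds material expenses */
             computedVars={{name="Margin"} {name="Profitable"}}
             computedVarsProgram="Margin = Revenue - ExpensesMaterial;
                                  if Revenue > ExpensesMaterial then Profitable = 1;
                                  else Profitable = 0;"
            }
      casout={name="MGFlagged"   /* hypothetical output table name */
              caslib="casuser"
              replace=true
             };
run;
quit;
```

Both statements live in the same program string, so one COMPUTEDVARSPROGRAM can define every variable named in COMPUTEDVARS.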
As in the example above, we can use this functionality to create new columns from calculations on other columns (as well as constants, macro variables, etc.). Any CAS action that reads a CAS table and writes it to a new location (Partition, Index, Shuffle, etc.) can be used to do this.
In the example, we used the partition action with no GROUP BY instructions so that it would simply write out a copy of the input CAS table with the new columns included. The new columns are calculated by the CAS action and materialized on the target CAS table.
While you can use computed variables to create new materialized columns on a target CAS table, you can also use them to perform one-off calculations for a specific analysis. For example, you might create new columns for a more advanced summary analysis as shown below.
proc cas;
   simple.summary /
      table={name="mega_corp"
             caslib="visual"
             computedVars={{name="Margin"} {name="RemainingLife"}}
             computedVarsProgram="Margin = Revenue - ExpensesMaterial;
                                  RemainingLife = UnitLifespan - UnitLifespanLimit;"
             groupBy={"unit"}
            }
      subset={"mean"}
      inputs={"profit" "revenue" "Margin" "RemainingLife"};
run;
quit;
In this example, the same two fields as before, Margin and RemainingLife, are created, but they are not stored anywhere. They exist only for the life of the CAS action and are used solely to enhance the output summary report.
NB: Visual Analytics Calculated Items actually manifest using CAS computed variables as shown in this use case. So they are not stored with the data. Their logic is simply applied when the CAS action for the VA report is run.
NB2: Visual Analytics Aggregated Measures do not manifest as CAS computed variables. Aggregated measures are generally computed after the CAS actions have completed.
CAS computed columns can also be defined when creating a CAS view. While the partition action (or any other action that creates an output CAS table) materializes any computedVars variables, the table.view action creates a virtual table. The view references the materialized columns of the input CAS table and combines them with the calculated columns (computedVars) from the view definition, which are evaluated only at query time. An example is below.
proc cas;
   table.view /
      name="virtualMega_corp"
      tables={{name="mega_corp"
               caslib="visual"
               computedVars={{name="Margin"} {name="RemainingLife"}}
               computedVarsProgram="Margin = Revenue - ExpensesMaterial;
                                    RemainingLife = UnitLifespan - UnitLifespanLimit;"
              }};
run;
quit;
CAS views have advantages over both alternatives. Compared with fully materialized CAS tables, they can save space, since some calculated columns can be quite large. Compared with temporary "on-the-fly" calculated columns, the logic only needs to be written once, in the view definition.
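As a sketch of that "write once" benefit, the summary below reads the view created above and uses Margin and RemainingLife as ordinary input columns, with no computedVars or computedVarsProgram in the call. It assumes the view virtualMega_corp lives in the session's active caslib, since the table.view example did not name one; add a caslib= value if yours differs.

```sas
proc cas;
   /* The computed columns come from the view definition, not from this call */
   simple.summary /
      table={name="virtualMega_corp"   /* assumes the view is in the active caslib */
             groupBy={"unit"}
            }
      subset={"mean"}
      inputs={"Margin" "RemainingLife"};
run;
quit;
```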
Read my previous article to learn more about CAS views.