In the first article of this series on CASL programming in SAS Viya for Learners we discovered how to use SAS Studio to make a connection with the CAS server, assign libraries to explore the sample data, perform a simple analysis and terminate the CAS session.
In this article we will be looking at basic data manipulation of the CAS tables. It is important to mention here that in CASL, as in all other languages, there is often more than one way to perform a task. It isn’t possible to cover all possible methods in these articles so we will explore what I believe to be the simplest and clearest way to achieve our goals. In time, as you practice, read the documentation and learn more you will discover other techniques which may well, in the right circumstances, be improvements to those you will see here. However, these methods will work and will get you up and running with CASL programming.
Firstly, we will start our session and set up our libraries.
/* Connect to the cas server and call the connection casconn*/
cas casconn;
/* Assign SAS librefs for all existing caslibs */
caslib _all_ assign;
The data set we will use for this article is called “Accepts” and is held in the ACADEMICS library. One important thing to note is that in SAS Viya for Learners we don’t have write access to that library. This is, actually, a good thing as it is a global library used by others and in any case it is not generally good practice to alter the original copy of your source data. We do, however, have write access to the CASUSER library. We can see that with the caslibinfo action in the tables action set.
/* Get information about caslibs */
proc cas;
table.caslibinfo caslib="academic";
run;
table.caslibinfo caslib="casuser";
run;
quit;
You can see from the results that in contrast to the ACADEMICS library CASUSER is both personal and transient. This means that only you can access that data and any data held in it only has session scope i.e. it only exists for the lifetime of the session so you must rerun your analysis to regenerate it if you terminate and restart your session. Again, this is not a bad thing as it prevents mistakes if your data changes or you alter any of your code.
Unfortunately, there is no copy action currently available, but we can utilise a little trick to mimic one. If we execute an action which has a casout parameter (not all do hence the need to sometimes make a copy) but don’t specify the parameter which alters the table we can simply make a direct copy. It is easier to understand this with our example.
/* Make a copy of the table */
proc cas;
table.index
table={caslib="academic", name="accepts"}
casout={caslib="casuser", name="accepts_1", replace=true};
quit;
Here we have used the index action but haven’t used the indexVars parameter to specify variables to create indexes for. Therefore, a straight copy will be written to CASUSER.
If we only want to use a small number of the columns in a table, we can use the alterTable action with either the drop or keep parameter. As the names imply drop doesn’t copy the columns specified to the output table whereas keep only copies the specified columns. Which one you choose will largely depend on how many columns you want to drop or keep. It is simply easier to use the parameter which requires you to specify fewest columns. In our case we only want to keep three columns, so we’ll use keep. This can be confirmed with the columnInfo action.
/* Alter the table keeping only 3 columns */
proc cas;
table.altertable
caslib="casuser",
name="accepts_1",
keep={"app_id", "down_pyt", "loan_amt"};
quit;
/* Get information about the columns in the table */
proc cas;
table.columninfo
table={caslib="casuser", name="accepts_1"};
quit;
We may wish to create a new column in the table, possibly based on values from other columns. We can do this by creating what is known as a calculated column. We can do this by using the table action parameter to an action, along with the table actions “subparameters” computedVars (which names the new columns) and computedVarsProgram (which specifies the formulae for creating them). Again, this is probably clearer in our example.
/* Create a calculated column */
proc cas;
table.index
table={
caslib="casuser"
name="accepts_1"
computedVars={{name="perc"} }
computedVarsProgram="perc=(down_pyt/loan_amt)*100;"
}
casout={caslib="casuser", name="accepts_2", replace=true};
quit;
/* Get information about the columns in the table */
proc cas;
table.columninfo
table={caslib="casuser", name="accepts_2"};
quit;
This is quite simple – we merely use a where statement in the table parameter when we copy our table. Here we only want to keep rows where the value of the perc column is greater than zero.
proc cas;
table.index
table={caslib="casuser", name="accepts_2" where="perc>0"}
casout={caslib="casuser", name="accepts_3", replace=true};
quit;
If we want to alter values in the table, we can do so by using the set parameter in the update action. Here we want to multiply all the values in the app_id column by 10.
proc cas;
table.update
table={caslib="casuser", name="accepts_3"}
set={{var="app_id", value="app_id*10"}};
quit;
If we want to delete rows from a table we can use the deleteRows action specifying the condition to use in deciding whether to delete the rows. Here (after first creating a copy of the table as shown earlier) we delete all rows where the value of loan_amt is less than twenty thousand.
proc cas;
table.index
table={caslib="casuser", name="accepts_3"}
casout={caslib="casuser", name="accepts_4", replace=true};
quit;
proc cas;
table.deleterows
table={caslib="casuser", name="accepts_4", where="loan_amt<20000"};
quit;
Finally we terminate our session:
/* Terminate the session */
cas casconn terminate;
I would encourage you to try out some of the techniques shown in this article and if you have any questions or comments, please leave a message below.
how can i delete rows that are present in another dataset?
Hi @dipand and thanks for the question. You can use the same deleteRows action to remove rows from any CAS table that you have write access to. It's important to remember though that a CAS table is an in-memory structure and you are NOT deleting rows from the underlying data set (or any other file used to create the table).
I hope this answers your question - if not please let me know, preferably with a cncrete example.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.