BASE SAS is the foundational programming language and environment for SAS software. It's been around for decades, and it's a real workhorse when it comes to data processing, statistical analysis, and reporting.
CASL (Cloud Analytic Services Language), by contrast, offers a modern approach to data analytics. It is a cloud-native programming language designed specifically for the SAS Viya platform.
In this article, we compare these two languages, highlighting their strengths, capabilities, and suitability for various data analytics tasks through simple practical examples.
BASE SAS is the fundamental software in the SAS suite, providing essential tools for managing, analysing, and reporting data. With it, users can manipulate and transform data, conduct statistical analyses, and generate reports to extract valuable insights using its rich set of procedures.
Although BASE SAS is a robust and versatile tool, it may have limitations when dealing with large datasets or real-time processing requirements.
CASL, on the other hand, represents a modern approach to data analytics. It relies on parallel, distributed, in-memory processing to handle large datasets and deliver results quickly, including in real-time scenarios.
It complements traditional analytics tools like BASE SAS and enables programmers to handle big data and optimise complex analytics workflows.
CASL is certainly not the only way to instruct CAS, but it is a powerful option. Stephen Foerster aptly described CASL as being like BASE SAS with the MACRO built-in.
This comparison helps you understand the nuances of BASE SAS and CASL, so that you can choose the right toolset for your specific analytical needs, infrastructure requirements, and business objectives.
In summary, while BASE SAS and CASL share similarities in basic syntax, CASL is specifically designed for the cloud-based, distributed computing environment of SAS Viya. It offers enhanced performance for large datasets, built-in conditional logic, and greater flexibility in terms of language integration and analytics lifecycle support.
This side-by-side code comparison highlights the differences in functionality and usage between BASE SAS and CASL, showcasing their unique strengths and applications. Each example performs the same task twice, once in each language.
We've covered a total of six different examples to showcase the differences. I highly recommend reviewing all the examples and their outputs, but feel free to focus on the one that interests you the most.
Here's a quick overview:
SAS vs. CASL #1: Import External Files
SAS vs. CASL #2: Load Datasets
SAS vs. CASL #3: Print Sample Data Values
SAS vs. CASL #4: Display Dataset Summary
SAS vs. CASL #5: Data Handling (Filtering, Grouping, and Sorting)
SAS vs. CASL #6: Generate Descriptive Statistics
Loading a CSV File in SAS
The following code demonstrates how to load a CSV file directly into SAS. It begins by defining a file reference (reffile) pointing to the CSV file located at the specified path. The proc import procedure is then used to read the CSV file.
The datafile parameter specifies the file to be imported, dbms=csv indicates that the file format is CSV, and out=work.hmeq_imported specifies the output dataset's name and location in the work library. The getnames=Yes statement tells SAS to use the first row of the CSV file as the variable names.
/* load file directly into sas */
filename reffile "/pb/Users/MayurJadhav/Files/hmeq.csv";
proc import datafile=reffile
dbms=csv
out=work.hmeq_imported;
getnames=Yes;
run;
Output Log:
NOTE: The infile REFFILE is:
Filename=/pb/Users/MayurJadhav/Files/hmeq.csv,
Owner Name=UNKNOWN,Group Name=UNKNOWN,
Access Permission=-rw-r--r--,
Last Modified=05Jul2024:21:06:17,
File Size (bytes)=438194
NOTE: 5960 records were read from the infile REFFILE.
The minimum record length was 21.
The maximum record length was 83.
NOTE: The data set WORK.HMEQ_IMPORTED has 5960 observations and 13 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
5960 rows created in WORK.HMEQ_IMPORTED from REFFILE.
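As a hedged variant of the import above, the sketch below adds two things the original code does not use: the replace option, so the step can be rerun when WORK.HMEQ_IMPORTED already exists, and the guessingrows=max statement, so PROC IMPORT scans every row before choosing column types and lengths. Whether you need either depends on your data; the file path is the same one used above.

```sas
/* sketch: same import, rerunnable, scanning all rows to guess column types */
filename reffile "/pb/Users/MayurJadhav/Files/hmeq.csv";

proc import datafile=reffile
    dbms=csv
    out=work.hmeq_imported
    replace;              /* overwrite WORK.HMEQ_IMPORTED if it already exists */
    getnames=yes;         /* first row holds the variable names */
    guessingrows=max;     /* scan all rows before choosing types and lengths */
run;
```

Scanning every row is slower on large files, so a smaller guessingrows value may be a reasonable trade-off there.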
Loading a CSV File in CASL
The following code shows how to load the same CSV file into CAS (Cloud Analytic Services) in the SAS Viya environment. The session statement tells PROC CAS to use the CAS session named casauto, and the upload statement then sends the CSV file to the server.
The path= parameter specifies the location of the CSV file, and the importoptions parameter indicates that the file type is CSV and that the first row contains the variable names (getNames=True). The casout parameter specifies that the imported data should be stored in a CAS table named "hmeq_in_cas", and the replace=True option ensures that any existing table with the same name will be replaced. Finally, the table.tableInfo action statement is used to display information about the CAS table.
/* load file directly into CAS */
proc cas;
session casauto;
upload path="/pb/Users/MayurJadhav/Files/hmeq.csv"
importoptions={filetype="CSV" getNames=True}
casout={
name="hmeq_in_cas"
replace=True
}
;
run;
table.tableInfo; /* shows information about a table */
run;
Output Log:
NOTE: Active Session now casauto.
NOTE: Cloud Analytic Services made the uploaded file available as table HMEQ_IN_CAS in caslib CASUSER(MayurJadhav).
NOTE: The table HMEQ_IN_CAS has been created in caslib CASUSER(MayurJadhav) from binary data uploaded to Cloud Analytic
Services.
{caslib=CASUSER(MayurJadhav),tableName=HMEQ_IN_CAS}
91
92 table.tableInfo; /* shows information about a table */
93 run;
94
Loading SAS Datasets into the SAS Work Library
The example below demonstrates how to load SAS datasets from the sashelp library into the work library, creating copies of the datasets for temporary use. This allows you to work with these copies without altering the original data.
We’ll continue to use the datasets stored in the WORK library throughout this article to demonstrate the distinctions between BASE SAS and CASL across various examples.
/* load sas datasets directly into sas work lib */
data work.class; set sashelp.class;
run;
data work.cars; set sashelp.cars;
run;
data work.iris; set sashelp.iris;
run;
Output Log:
80 /* load sas datasets directly into sas work lib */
81
82 data work.class; set sashelp.class;
83 run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
84 data work.cars; set sashelp.cars;
85 run;
NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.CARS has 428 observations and 15 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
86 data work.iris; set sashelp.iris;
87 run;
NOTE: There were 150 observations read from the data set SASHELP.IRIS.
NOTE: The data set WORK.IRIS has 150 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Loading SAS Datasets into the CAS Library
Similar to the previous example where copies of datasets are created in the work library, this code loads built-in datasets into the CAS (Cloud Analytic Services) environment for processing.
The cas casauto statement requests a CAS session named casauto; because that session already exists here, the log shows a warning and the existing session is used. The code then loads three datasets, replacing any existing versions in CAS. The tables keep their original names and are stored in the CASUSER caslib. This allows you to work with these datasets in the CAS environment, which is designed for high-performance, in-memory processing.
/* load sas datasets directly into CAS lib */
proc casutil;
cas casauto;
load data=sashelp.class replace;
load data=sashelp.cars replace;
load data=sashelp.iris replace;
run;
Output Log:
80 /* load sas datasets directly into CAS lib */
81 proc casutil;
NOTE: The UUID '105b1d4a-836b-3448-91a7-a1b5d0da05c2' is connected using session CASAUTO.
82 cas casauto;
WARNING: A session with the name CASAUTO already exists.
83 load data=sashelp.class replace;
NOTE: SASHELP.CLASS was successfully added to the "CASUSER(MayurJadhav)" caslib as "CLASS".
84 load data=sashelp.cars replace;
NOTE: SASHELP.CARS was successfully added to the "CASUSER(MayurJadhav)" caslib as "CARS".
85 load data=sashelp.iris replace;
NOTE: SASHELP.IRIS was successfully added to the "CASUSER(MayurJadhav)" caslib as "IRIS".
86 run;
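To confirm that the three tables are now in memory, one option is the list statement of PROC CASUTIL. This is a minimal sketch that assumes the tables were loaded into the CASUSER caslib, as the log above reports.

```sas
/* sketch: list the in-memory tables in the CASUSER caslib */
proc casutil;
    list tables incaslib="casuser";
quit;
```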
Printing Sample Data Values in BASE SAS
This code prints the first 10 rows of the dataset named "class" from the "work" library, which we loaded in the previous example. It shows the values for the columns listed in the VAR statement: name, sex, age, height, and weight.
/* print sample data values */
proc print data=work.class (obs=10);
var name sex age height weight;
run;
Results:
Printing Sample Data Values in CASL
This code prints the first 10 rows of a dataset named "class" in the CAS (Cloud Analytic Services) environment from the CASUSER CAS library.
It connects to the CAS session named casauto, and then uses the table.fetch action to retrieve the values for the columns: name, sex, age, height, and weight. The to=10 option specifies that only the first 10 rows should be displayed. The output of this will be printed on the “RESULTS” tab.
/* print sample data values in CASL */
proc cas;
session casauto;
table.fetch /
format=true,
fetchvars = {"name", "sex", "age", "height", "weight"},
table="class",
to=10;
run;
quit;
Results:
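Since CASL is not the only way to instruct CAS, the same table can also be printed with an ordinary procedure through the CAS LIBNAME engine. The sketch below assumes an active CAS session and the CLASS table in the CASUSER caslib; the libref mycas is a hypothetical name chosen for illustration.

```sas
/* sketch: read a CAS table with a regular procedure via the CAS engine */
libname mycas cas caslib=casuser;    /* mycas is a hypothetical libref */

proc print data=mycas.class (obs=10);
    var name sex age height weight;
run;
```

Because CAS tables are distributed, the first 10 rows returned this way are not guaranteed to match the order of the original SAS dataset.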
Displaying a Summary of Dataset Contents in BASE SAS
The traditional BASE SAS proc contents procedure provides detailed structural and metadata information about a dataset named "class" located in the "work" library. It includes information such as variable names (columns), their types (numeric or character), and additional attributes like format and length.
This procedure provides an in-depth overview of the dataset's layout and characteristics, focusing solely on its structure and metadata, without displaying the actual data values contained within the dataset.
/* display table contents */
proc contents data=work.class;
run;
Results:
Displaying a Summary of Dataset Contents in CASL
To get the details of a CAS table's contents, you need several actions from the table action set, such as caslibInfo, columnInfo, recordCount, tableDetails, and tableInfo.
Retrieve CAS Table Information:
/* display CAS table contents */
proc cas;
session casauto;
table.caslibInfo;
table.columninfo / table="class";
table.recordCount / table="class";
table.tableDetails / table="class";
table.tableInfo / table="class";
run;
quit;
Results:
Filtering, Grouping, and Sorting Variables in BASE SAS
The following code shows basic data handling operations: filtering, grouping, and sorting data based on selected variables.
This code retrieves specific columns (name, sex, age, height, weight) from the "class" dataset, keeps only females (sex="F"), and sorts the rows in descending order by "name" and "age"; outobs=10 limits the output to 10 rows. Because the query contains no summary function, the GROUP BY clause does not aggregate anything: PROC SQL treats it as an ORDER BY and reports this in the log. The example shows how PROC SQL is used for data handling tasks such as filtering, grouping, and sorting within SAS.
/* data handling: filtering, grouping, and sorting by variables */
proc sql outobs=10;
select name, sex, age, height, weight from work.class
where sex="F"
group by name, age
order by name desc, age desc;
quit;
Results:
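The query above contains no summary function, so its GROUP BY clause has nothing to aggregate. For contrast, here is a minimal sketch of a PROC SQL query in which grouping does real work. The column aliases n_students, avg_height, and avg_weight are hypothetical names chosen for illustration.

```sas
/* sketch: GROUP BY with summary functions, one output row per sex */
proc sql;
    select sex,
           count(*)     as n_students,
           mean(height) as avg_height,
           mean(weight) as avg_weight
    from work.class
    group by sex
    order by sex;
quit;
```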
Filtering, Grouping, and Sorting Variables in CASL
Similar to the proc sql procedure in traditional SAS, in CAS you can use the table.fetch action for many data handling operations. This action allows you to retrieve data from CAS tables based on specified criteria, filter results, sort rows, and limit the number of returned rows, among other functionalities.
The code below uses the casauto session, specifies the conditions and variables to fetch from the "class" table, retrieves the rows where sex is female, sorts them by name and age in descending order, fetches the top 10 rows, describes the fetched result, and prints the data:
/* data handling in CASL: filtering, grouping, and sorting by variables */
proc cas;
session casauto;
classtbl.name ="class";
classtbl.where = "sex = 'F'";
fvars = {"name", "sex", "age", "height", "weight"};
table.fetch result=r_var/ /* results of the fetch action are saved in the "r_var" variable */
format=false,
fetchvars = fvars,
index=false,
sortby={
{name="name", order="descending"},
{name="age", order="descending"}
},
table=classtbl,
to=10;
describe r_var;
print r_var;
run;
quit;
Results:
Calculating Descriptive Statistics in BASE SAS
This code generates descriptive statistics for a dataset named "class" and organizes the results by the "sex" variable. First, it sorts the dataset by sex and then calculates descriptive statistics such as the minimum, maximum, mean, and standard deviation for each group. The results are saved in a new dataset named work.summary_stats.
/* Generate descriptive statistics */
proc sort data=class out=classbysex;
by sex;
run;
proc means data=classbysex max mean min n nmiss std stderr;
by sex;
output out=summary_stats;
run;
Results:
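An alternative that avoids the separate sort step is the CLASS statement, which does not require sorted input. The sketch below is one possible variant, not the article's method: it writes one row per sex with named mean and standard deviation columns. The output dataset name summary_stats_wide is hypothetical, and the autoname option lets SAS generate the statistic column names.

```sas
/* sketch: grouped statistics without PROC SORT, written to a wide dataset */
proc means data=work.class noprint nway;
    class sex;                       /* grouping variable, no sort needed */
    var age height weight;           /* analysis variables */
    output out=work.summary_stats_wide
           mean= std= / autoname;    /* SAS builds the statistic column names */
run;

proc print data=work.summary_stats_wide;
run;
```

The nway option keeps only the per-group rows and drops the overall summary row that a CLASS statement would otherwise add to the output dataset.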
Calculating Descriptive Statistics in CASL
You can generate the same descriptive statistics from the CAS table using the simple.summary CAS action. It generates descriptive statistics for numeric variables such as the sample mean, sample variance, sample size, sum of squares, and more.
The following code generates descriptive statistics for a dataset named "class" in the CAS environment, organizing the results by the "sex" variable. The simple.summary action calculates the statistics listed in the subSet= option, including the maximum, mean, minimum, count, number of missing values, standard deviation, and standard error for each group.
/* Generate descriptive statistics in CASL*/
proc cas;
tbl1.name = "class";
tbl1.groupBy = "sex";
simple.summary /
table = tbl1,
subSet = {"MAX", "MEAN", "MIN", "N", "NMISS", "STD", "STDERR"};
run;
quit;
Results:
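To keep the statistics as a CAS table rather than only displaying them, the simple.summary action can write to an output table. This is a hedged sketch: it assumes the CLASS table is in the active caslib, restricts the analysis to three numeric columns with inputs=, and uses the hypothetical output table name class_summary.

```sas
/* sketch: save grouped summary statistics to a CAS table, then display it */
proc cas;
    session casauto;
    simple.summary /
        table={name="class", groupBy={"sex"}},
        inputs={"age", "height", "weight"},           /* columns to analyze */
        subSet={"MEAN", "STD"},                       /* statistics to compute */
        casOut={name="class_summary", replace=true};  /* hypothetical table name */
    table.fetch / table="class_summary";
run;
quit;
```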
In conclusion, this comparative analysis of BASE SAS and CASL explores their respective strengths and applications in data analytics. I hope this head-to-head comparison, with its worked examples, helps you select the appropriate toolset for your analytical needs and infrastructure requirements.
This article shows how to translate your BASE SAS code into CASL; to learn more, see “When to CASL and not to CASL, SAS programming in SAS Viya”.
Whether you choose to leverage the advanced capabilities of CASL or retain some functionalities of BASE SAS, this choice will greatly impact how well you can understand and use your data to gain valuable insights.