
BASE SAS vs CASL: A Comparative Analysis That Will Help You in Code Comparison and Transition

Started 07-15-2024, modified 07-26-2024

BASE SAS is the foundational programming language and environment for SAS software. It's been around for decades, and it's a real workhorse when it comes to data processing, statistical analysis, and reporting. 

 

CASL (Cloud Analytic Services Language), by contrast, is a newer scripting language designed specifically for the SAS Viya platform. It is used to run actions on the SAS Cloud Analytic Services (CAS) server and to work with their results.

 

In this article, we compare the two languages, highlighting their strengths, capabilities, and suitability for various data analytics tasks through simple practical examples.

 

BASE SAS

It's the fundamental software in the SAS suite, providing essential tools for managing, analysing, and reporting data. With BASE SAS, users can manipulate and transform data, conduct statistical analyses, and generate reports to extract valuable insights using its rich set of BASE SAS procedures.

 

Although BASE SAS is a robust and versatile tool, it may have limitations when dealing with large datasets or real-time processing requirements.

 

CASL (Cloud Analytic Services Language)

CASL, on the other hand, represents a modern approach to data analytics. It relies on parallel, distributed, in-memory processing in CAS to handle large datasets and return results quickly.

 

It complements traditional analytics tools like BASE SAS and enables programmers to handle big data and optimise complex analytics workflows.

 

CASL is certainly not the only way to instruct CAS, but it is a powerful option. Stephen Foerster aptly described CASL as being "like BASE SAS with the MACRO built-in."

 

BASE SAS vs CASL: Comparative Analysis

 

This analysis outlines the main differences between BASE SAS and CASL so that you can choose the right toolset for your analytical needs and infrastructure.

 

  1. Language Structure:
    • BASE SAS: Combines DATA step with SAS Procedures.
    • CASL: Statement-based scripting language that is case insensitive and executes CAS actions.
  2. Processing Engine:
    • BASE SAS: Runs on traditional SAS server.
    • CASL: Interacts with SAS Cloud Analytic Services (CAS), enabling distributed computing.
  3. Data Handling:
    • BASE SAS: Processes data sequentially.
    • CASL: Supports in-memory processing, allowing for faster handling of big data.
  4. Conditional Logic:
    • BASE SAS: The DATA step has its own IF/THEN logic, but conditionally running whole steps or generating code requires the SAS macro language.
    • CASL: Has built-in conditional logic capabilities, similar to having MACRO functionality integrated.
  5. Procedure Execution:
    • BASE SAS: Uses PROC statements directly.
    • CASL: Uses CAS actions instead of procedures, though these actions often correspond to CAS-enabled PROCs.
  6. Data Access:
    • BASE SAS: Primarily works with local data and data stored in SAS datasets on the server. It is well-suited for traditional data processing and analysis tasks.
    • CASL: Designed to access and manipulate data in CAS tables, which allows for distributed data processing.
  7. Scalability:
    • BASE SAS: Limited by single-machine processing.
    • CASL: Designed for scalable, distributed computing environments.
  8. Language Integration:
    • BASE SAS: Primarily SAS-centric.
    • CASL: Runs in PROC CAS; the CAS actions it calls can also be invoked from other clients, including Python, R, Java, and REST APIs.
  9. Analytics Lifecycle Support:
    • BASE SAS: Supports traditional analytics workflows.
    • CASL: Designed to support the entire analytical lifecycle, including data management, analytics, and scoring.
  10. Performance:
    • BASE SAS: Efficient for traditional data processing tasks.
    • CASL: Optimized for high-performance analytics, especially with large datasets.
  11. Code Generation:
    • BASE SAS: Often requires macro programming for dynamic code generation.
    • CASL: Offers more flexible options for dynamic code generation and execution.
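
To make the conditional-logic point concrete, here is a minimal sketch of the same check written both ways. It assumes a WORK.CLASS dataset and a CLASS table in the CASUSER caslib; the macro name print_if_exists is hypothetical and used only for illustration.

```sas
/* BASE SAS: running a step conditionally goes through the macro language */
/* (print_if_exists is a hypothetical macro name) */
%macro print_if_exists(ds);
   %if %sysfunc(exist(&ds)) %then %do;
      proc print data=&ds(obs=5);
      run;
   %end;
   %else %put NOTE: &ds was not found.;
%mend print_if_exists;
%print_if_exists(work.class)

/* CASL: the same kind of check uses native IF/THEN/ELSE */
proc cas;
   /* r.exists is 0 when the table is not found */
   table.tableExists result=r / caslib="casuser", name="class";
   if r.exists then do;
      table.fetch / table={caslib="casuser", name="class"}, to=5;
   end;
   else print "Table CLASS was not found in CASUSER.";
run;
quit;
```

In the CASL version the result of one action is held in an ordinary variable (r) and tested directly, which is what the "MACRO built-in" description refers to.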

 

In summary, while BASE SAS and CASL share similarities in basic syntax, CASL is specifically designed for the cloud-based, distributed computing environment of SAS Viya. It offers enhanced performance for large datasets, built-in conditional logic, and greater flexibility in terms of language integration and analytics lifecycle support.

 

BASE SAS vs CASL: Code Comparison and Transition

 

This side-by-side code comparison highlights the differences in functionality and usage between BASE SAS and CASL, showcasing their unique strengths and applications. The given examples demonstrate both approaches to performing the same task but in different ways.

 

We've covered a total of six different examples to showcase the differences. I highly recommend reviewing all the examples and their outputs, but feel free to focus on the one that interests you the most.

 

Here's a quick overview:

 

SAS vs. CASL #1: Import External Files

  • Loading a CSV File in BASE SAS
  • Loading a CSV File in CASL

SAS vs. CASL #2: Load Datasets

  • Loading SAS Datasets into the SAS Work Library
  • Loading SAS Datasets into the CAS Library

SAS vs. CASL #3: Print Sample Data Values

  • Printing Sample Data Values in BASE SAS
  • Printing Sample Data Values in CASL

SAS vs. CASL #4: Display Dataset Summary

  • Displaying a Summary of Dataset Contents in BASE SAS
  • Displaying a Summary of Dataset Contents in CASL

SAS vs. CASL #5: Data Handling (Filtering, Grouping, and Sorting)

  • Filtering, Grouping, and Sorting Variables in BASE SAS
  • Filtering, Grouping, and Sorting Variables in CASL

SAS vs. CASL #6: Generate Descriptive Statistics

  • Calculating Descriptive Statistics in BASE SAS
  • Calculating Descriptive Statistics in CASL

 

SAS vs. CASL #1: Import External Files

 

Loading a CSV File in BASE SAS

The following code demonstrates how to load a CSV file directly into SAS. It begins by defining a file reference (reffile) pointing to the CSV file located at the specified path. The proc import procedure is then used to read the CSV file.

 

The datafile= option specifies the file to be imported, dbms=csv indicates that the file format is CSV, and out=work.hmeq_imported specifies the output dataset's name and location in the work library. The getnames=Yes statement, which follows the PROC IMPORT statement, tells SAS to use the first row of the CSV file as the variable names.

 

/* load file directly into sas  */
filename reffile "/pb/Users/MayurJadhav/Files/hmeq.csv";
proc import datafile=reffile
	dbms=csv
	out=work.hmeq_imported;
	getnames=Yes;
run;

 

Output Log:

NOTE: The infile REFFILE is:
      Filename=/pb/Users/MayurJadhav/Files/hmeq.csv,
      Owner Name=UNKNOWN,Group Name=UNKNOWN,
      Access Permission=-rw-r--r--,
      Last Modified=05Jul2024:21:06:17,
      File Size (bytes)=438194
NOTE: 5960 records were read from the infile REFFILE.
      The minimum record length was 21.
      The maximum record length was 83.
NOTE: The data set WORK.HMEQ_IMPORTED has 5960 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      
5960 rows created in WORK.HMEQ_IMPORTED from REFFILE.
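
If the import is rerun, PROC IMPORT does not overwrite an existing output dataset unless REPLACE is specified. The sketch below is a hedged variation on the step above, not the author's original code: it adds REPLACE and, as an optional choice, GUESSINGROWS=MAX so that SAS scans every row before deciding variable types and lengths.

```sas
/* Variation on the import above: safe to rerun */
filename reffile "/pb/Users/MayurJadhav/Files/hmeq.csv";

proc import datafile=reffile
    dbms=csv
    out=work.hmeq_imported
    replace;              /* overwrite WORK.HMEQ_IMPORTED if it already exists */
    getnames=yes;         /* first row holds the variable names */
    guessingrows=max;     /* scan all rows before choosing types and lengths */
run;
```

Scanning all rows is slower on large files, so a smaller GUESSINGROWS value may be a better trade-off there.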

 

Loading a CSV File in CASL

The following code shows how to load the same CSV file into CAS (Cloud Analytic Services) in the SAS Viya environment. It makes casauto the active CAS session and then uploads the CSV file using the upload statement.

 

The path= parameter specifies the location of the CSV file, and the importoptions parameter indicates that the file type is CSV and that the first row contains the variable names (getNames=True). The casout parameter specifies that the imported data should be stored in a CAS table named "hmeq_in_cas", and the replace=True option ensures that any existing table with the same name will be replaced. Finally, the table.tableInfo action statement is used to display information about the CAS table.

 

/* load file directly into CAS  */
proc cas;
	session casauto;
	upload path="/pb/Users/MayurJadhav/Files/hmeq.csv"
	importoptions={filetype="CSV" getNames=True} 
	casout={
		name="hmeq_in_cas"
		replace=True
		}
;
run;

table.tableInfo;  /* shows information about a table */
run;
quit;

 

 

Output Log:

NOTE: Active Session now casauto.
NOTE: Cloud Analytic Services made the uploaded file available as table HMEQ_IN_CAS in caslib CASUSER(MayurJadhav).
NOTE: The table HMEQ_IN_CAS has been created in caslib CASUSER(MayurJadhav) from binary data uploaded to Cloud Analytic 
      Services.
{caslib=CASUSER(MayurJadhav),tableName=HMEQ_IN_CAS}
91    
92    table.tableInfo;  /* shows information about a table */
93    run;
94    
 
Results:
MayurJadhav_0-1721047439727.png
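
Called without parameters, table.tableInfo reports on the tables in the active caslib. As a small sketch, assuming the upload landed in the CASUSER caslib as the log above indicates, the action can be limited to the new table:

```sas
/* Show information for the uploaded table only */
proc cas;
   session casauto;
   table.tableInfo / caslib="casuser", name="hmeq_in_cas";
run;
quit;
```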

 

 

SAS vs. CASL #2: Load Datasets 

 

Loading SAS Datasets into the SAS Work Library

The example below loads SAS datasets from the sashelp library into the work library, creating temporary copies. This lets you work with the copies without altering the original data.

 

We’ll continue to use the datasets stored in the WORK library throughout this article to demonstrate the distinctions between BASE SAS and CASL across various examples.

 

/* load sas datasets directly into sas work lib */

data work.class; set sashelp.class;
run;
data work.cars; set sashelp.cars;
run;
data work.iris; set sashelp.iris;
run;

 

 Output Log:

80    /* load sas datasets directly into sas work lib */
81    
82    data work.class; set sashelp.class;
83    run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
84    data work.cars; set sashelp.cars;
85    run;
NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.CARS has 428 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
      
86    data work.iris; set sashelp.iris;
87    run;
NOTE: There were 150 observations read from the data set SASHELP.IRIS.
NOTE: The data set WORK.IRIS has 150 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
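
As an alternative sketch, the same three copies can be made in a single step with PROC COPY instead of three DATA steps. This is a different technique from the one shown above, offered only as an option:

```sas
/* Copy selected datasets from SASHELP to WORK in one step */
proc copy in=sashelp out=work memtype=data;
   select class cars iris;
run;
```

The DATA step form is still the better fit when rows or columns need to be changed during the copy.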

 

Loading SAS Datasets into the CAS Library

Similar to the previous example where copies of datasets are created in the work library, this code loads built-in datasets into the CAS (Cloud Analytic Services) environment for processing.

 

It refers to the CAS session named casauto and then loads three datasets, replacing any existing versions in CAS. The tables keep the same names but are stored in the CASUSER caslib, where they are available for high-performance, in-memory processing. As the log below shows, the cas casauto; statement produces a warning when the session already exists, so it can be omitted if the session is already active.

 

/* load sas datasets directly into CAS lib */
proc casutil;
cas casauto;
  load data=sashelp.class replace;
  load data=sashelp.cars replace;
  load data=sashelp.iris replace;
run;

 

Output Log:

80    /* load sas datasets directly into CAS lib */
81    proc casutil;
NOTE: The UUID '105b1d4a-836b-3448-91a7-a1b5d0da05c2' is connected using session CASAUTO.
82    cas casauto;
WARNING: A session with the name CASAUTO already exists.
83      load data=sashelp.class replace;
NOTE: SASHELP.CLASS was successfully added to the "CASUSER(MayurJadhav)" caslib as "CLASS".
84      load data=sashelp.cars replace;
NOTE: SASHELP.CARS was successfully added to the "CASUSER(MayurJadhav)" caslib as "CARS".
85      load data=sashelp.iris replace;
NOTE: SASHELP.IRIS was successfully added to the "CASUSER(MayurJadhav)" caslib as "IRIS".
86    run;
MayurJadhav_1-1721047986902.png
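
The LOAD statements above rely on defaults for the target caslib and table name. A minimal sketch that states them explicitly and then lists the tables to confirm the load, assuming the CASUSER caslib is the intended target:

```sas
/* Name the target caslib and table explicitly, then verify */
proc casutil;
   load data=sashelp.class outcaslib="casuser" casout="class" replace;
   list tables incaslib="casuser";
quit;
```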

 

 

 

SAS vs. CASL #3: Print Sample Data Values

 

Printing Sample Data Values in BASE SAS

This code prints the first 10 rows of the dataset named "class" from the "work" library, which we loaded in the previous example. It shows the values of the columns listed in the VAR statement: name, sex, age, height, and weight.


/* print sample data values */

proc print data=work.class (obs=10);
	var name sex age height weight;
run;

 

Results:

MayurJadhav_0-1721048290169.png

 

 

Printing Sample Data Values in CASL

This code prints the first 10 rows of a dataset named "class" in the CAS (Cloud Analytic Services) environment from the CASUSER CAS library.

 

It uses the CAS session named casauto and the table.fetch action to retrieve the values of the columns name, sex, age, height, and weight. The to=10 option limits the output to the first 10 rows, and format=true applies the columns' formats to the values. The output appears on the Results tab.

 

 

/* print sample data values in CASL */
proc cas;
  session casauto;

  table.fetch / 
    format=true,
	fetchvars = {"name", "sex", "age", "height", "weight"},
    table="class",
    to=10;
run; 
quit;

 

Results:

MayurJadhav_1-1721048473519.png
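
When moving code over gradually, it can help to know that a CAS table can also be read by familiar procedures through the CAS LIBNAME engine. The sketch below assumes the casauto session and the CASUSER caslib used above; the libref name mycas is an arbitrary choice.

```sas
/* Assign a libref to the CASUSER caslib (mycas is an arbitrary libref name) */
libname mycas cas caslib=casuser sessref=casauto;

/* The familiar PROC PRINT step now reads the in-memory CAS table */
proc print data=mycas.class(obs=10);
   var name sex age height weight;
run;
```

Because a CAS table can be distributed across workers, the row order may differ from the order in the original SAS dataset.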

 

 

SAS vs. CASL #4: Display Dataset Summary

 

Displaying a Summary of Dataset Contents in BASE SAS

The traditional BASE SAS proc contents procedure provides detailed structural and metadata information about a dataset named "class" located in the "work" library. It includes information such as variable names (columns), their types (numeric or character), and additional attributes like format and length. 

 

This procedure provides an in-depth overview of the dataset's layout and characteristics, focusing solely on its structure and metadata, without displaying the actual data values contained within the dataset.

 

/* display table contents */
proc contents data=work.class;
run;

 

Results:

MayurJadhav_2-1721048821466.png

 

 

 

Displaying a Summary of Dataset Contents in CASL

 

CAS has no single equivalent of proc contents. To get the details of a CAS table, you combine several actions from the table action set, such as caslibInfo, columnInfo, recordCount, tableDetails, and tableInfo.

 

Retrieve CAS Table Information:

  • table.caslibInfo; displays information about the caslibs available in the session.
  • table.columninfo / table="class"; provides details about the columns (variables) in the "class" table, such as their names, types, and lengths.
  • table.recordCount / table="class"; shows the number of records (rows) in the "class" table.
  • table.tableDetails / table="class"; reports how the "class" table is stored, such as the number of rows and blocks and the data size.
  • table.tableInfo / table="class"; offers general information about the "class" table, such as its name, number of rows and columns, and creation and modification times.

 

/* display CAS table contents */
proc cas;
  session casauto;

  table.caslibInfo;
  table.columninfo / table="class";
  table.recordCount / table="class";
  table.tableDetails / table="class";
  table.tableInfo / table="class";
run;
quit;

 

Results:

MayurJadhav_3-1721049347763.png
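
Action output can also be captured in a CASL variable rather than only displayed. A minimal sketch, assuming the "class" table loaded earlier; the variable name ci is arbitrary:

```sas
/* Capture the columnInfo output instead of only displaying it */
proc cas;
   session casauto;
   table.columnInfo result=ci / table="class";
   describe ci;            /* shows the structure of the result */
   print ci.ColumnInfo;    /* prints the ColumnInfo result table */
run;
quit;
```

Holding the result in a variable is what makes it possible to drive later actions from it, for example looping over column names.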

 

 

 

SAS vs. CASL #5: Data Handling (Filtering, Grouping, and Sorting)

 

Filtering, Grouping, and Sorting Variables in BASE SAS

The following code shows basic data handling operations: filtering, grouping, and sorting data based on selected variables.


This code selects specific columns (name, sex, age, height, weight) from the "class" dataset, keeps only females (sex="F"), and sorts the rows in descending order by "name" and "age", returning at most 10 rows (outobs=10). Note that the GROUP BY clause has no aggregating effect here: because the query contains no summary function, PROC SQL treats it as an ORDER BY and writes a warning to the log. The example shows how PROC SQL is used for data handling tasks within SAS.

 

/* data handling: filtering, grouping, and sorting by variables */

proc sql outobs=10;
	select name, sex, age, height, weight from work.class
	where sex="F"
	group by name, age
	order by name desc, age desc;
quit;

 

Results:

MayurJadhav_4-1721049614005.png
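
For GROUP BY to actually aggregate, the SELECT clause needs a summary function. The following sketch is a separate illustration on the same WORK.CLASS data, not a rewrite of the query above; the column aliases n and avg_height are arbitrary.

```sas
/* One row per group: count and average height by sex */
proc sql;
   select sex,
          count(*)     as n,
          mean(height) as avg_height format=6.2
   from work.class
   group by sex
   order by sex;
quit;
```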

 

 

Filtering, Grouping, and Sorting Variables in CASL

For simple queries like the proc sql example above, you can use the table.fetch action in CAS. It retrieves rows from a CAS table, optionally restricted by a where condition on the table, selects the columns to return, sorts the rows, and limits the number of rows returned. It does not aggregate data; for that, use an action such as simple.summary, shown in example #6.

 

The code below uses the casauto session, specifies the condition and variables to fetch from the "class" table, retrieves the rows where sex is female, sorts them by name and age in descending order, fetches the top 10 rows, describes the fetched result, and prints the data.

 

The key statements are:

  • classtbl.name ="class"; specifies that the table to be queried is named "class".
  • classtbl.where = "sex = 'F'"; sets a condition to retrieve rows where the sex column equals 'F' (indicating females).
  • table.fetch result=r_var/ ... ; performs the fetch action and saves the result in the r_var variable.
  • table=classtbl, specifies the table from which data is fetched.
  • to=10; limits the fetch to the first 10 rows.
  • describe r_var; describes the structure of the fetched result (r_var).
  • print r_var; prints the fetched data (r_var).

 

 

/* data handling in CASL: filtering, grouping, and sorting by variables */
proc cas;
  session casauto;
  classtbl.name  ="class";                                           
  classtbl.where = "sex = 'F'";
  fvars = {"name", "sex", "age", "height", "weight"};

  table.fetch  result=r_var/   /* results of the fetch action are saved in the "r_var" variable */
    format=false,
    fetchvars = fvars,	
	index=false, 
    sortby={
      {name="name", order="descending"},                  
      {name="age", order="descending"}
    },
    table=classtbl,
    to=10;
describe r_var; 
print r_var;
run; 
quit;

 

Results:

MayurJadhav_5-1721049861783.png
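
When a query needs real grouping, one option is to send SQL to CAS with the fedSql.execDirect action. This is a hedged sketch rather than part of the original example: it assumes the "class" table is in the CASUSER caslib, and the loadactionset statement is included in case the fedSql action set is not already loaded in the session.

```sas
/* Grouped query in CAS through FedSQL */
proc cas;
   session casauto;
   loadactionset "fedSql";
   fedSql.execDirect /
      query="select sex, count(*) as n, avg(height) as avg_height from casuser.class group by sex";
run;
quit;
```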

 

 

 

SAS vs. CASL #6: Generate Descriptive Statistics

 

Calculating Descriptive Statistics in BASE SAS

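This code generates descriptive statistics for the "class" dataset, grouped by the "sex" variable. It first sorts the dataset by sex, which BY-group processing requires, and then calculates the statistics requested on the PROC MEANS statement (maximum, mean, minimum, N, number of missing values, standard deviation, and standard error) for each group. The OUTPUT statement also saves a dataset named work.summary_stats; because it names no statistics, that dataset holds the default set (N, MIN, MAX, MEAN, and STD) rather than the list shown in the printed report.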

 

/* Generate descriptive statistics */
proc sort data=class out=classbysex;
   by sex;
run;
proc means data=classbysex max mean min n nmiss std stderr;
   by sex;
   output out=summary_stats;
run;

 

Results:

MayurJadhav_6-1721050431677.png
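
To control which statistics reach the output dataset, they can be named on the OUTPUT statement. The sketch below is an alternative formulation, not the author's code: it uses a CLASS statement, which removes the need for the preliminary sort, and AUTONAME to generate variable names such as Height_Mean.

```sas
/* One output row per value of sex, with explicitly requested statistics */
proc means data=work.class noprint nway;
   class sex;
   var age height weight;
   output out=work.summary_stats
          mean= std= min= max= / autoname;
run;

proc print data=work.summary_stats;
run;
```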

 

 

Calculating Descriptive Statistics in CASL

You can generate the same descriptive statistics from the CAS table using the simple.summary CAS action. It generates descriptive statistics for numeric variables such as the sample mean, sample variance, sample size, sum of squares, and more. 

 

The following code generates descriptive statistics for the "class" table in the CAS environment, grouped by the "sex" variable. The simple.summary action calculates the statistics listed in the subSet= parameter: maximum, mean, minimum, count, number of missing values, standard deviation, and standard error for each group.

 

/* Generate descriptive statistics in CASL*/
proc cas;
  tbl1.name = "class";                                             
  tbl1.groupBy = "sex";

  simple.summary /
     table = tbl1,
     subSet = {"MAX", "MEAN", "MIN", "N", "NMISS", "STD", "STDERR"};
run;
quit; 

 

Results:

MayurJadhav_7-1721050623967.png
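
The summary can also be written to a CAS table, which plays the role of the OUTPUT statement in PROC MEANS. A minimal sketch, assuming the "class" table loaded earlier; the output table name class_summary is hypothetical:

```sas
/* Save the grouped summary to a CAS table (class_summary is a hypothetical name) */
proc cas;
   session casauto;
   simple.summary /
      table={name="class", groupBy={"sex"}},
      inputs={"age", "height", "weight"},
      subSet={"MIN", "MAX", "MEAN", "STD"},
      casOut={name="class_summary", replace=true};

   /* Look at the saved summary */
   table.fetch / table="class_summary";
run;
quit;
```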

 

 

 

Conclusion

In conclusion, this comparison of BASE SAS and CASL has explored their respective strengths and applications in data analytics. I hope the head-to-head examples help you select the appropriate toolset for your analytical needs and infrastructure.

 

This article shows how to translate BASE SAS code into CASL. To learn more, see “When to CASL and not to CASL, SAS programming in SAS Viya”.

 

Whether you choose to leverage the advanced capabilities of CASL or retain some functionalities of BASE SAS, this choice will greatly impact how well you can understand and use your data to gain valuable insights.

 

 

 
