
Autoload/unload/import of data via scheduled jobs in SAS Viya

Started ‎03-13-2018 by
Modified ‎08-17-2023 by
Views 14,568

Our old friend autoload from Visual Analytics 7.x is back, well, sort of. Autoload-like capabilities have been added to Visual Analytics on Viya. In this blog I will look at what you can and cannot do. The addition of this functionality is very useful, and it is a much simpler and cleaner implementation than autoload in VA 7.x. It is a nice addition to the existing just-in-time load functionality that Scott discussed in this post.

 

By now everyone should know what a CASLIB is. If you don't, or need a refresher, check out this video: SAS Viya CAS Libraries (Caslibs) Simplified

 

In SAS Viya a user is able to create and schedule jobs. It is this functionality that can be used to automatically load/unload/refresh in-memory data. There are four types of jobs that can be created:

 

  • CAS table state management
  • SAS Data Explorer
  • SAS Data Studio
  • SAS Visual Analytics

 

In this blog we will look at CAS table state management jobs. Using these jobs, an administrator can automate the import, load, refresh and unload of source files into memory in CAS. SAS Environment Manager provides three sample jobs that can be used to create additional jobs. The sample jobs:

 

  • import source files to sashdat format in an existing CASLIB. Import supports CSV, SAS7BDAT and Excel.
  • load caslib source files to in-memory tables. You can load new tables and refresh tables that already exist in memory.
  • unload tables from memory. Tables can be unloaded either immediately, or based on when the table was last accessed.

 

CAS table state management jobs are documented in the System Administration guide under Data > CAS Table state management.

 

The jobs allow an administrator to set up a schedule where tables are added to memory, existing tables are refreshed from their source, and tables are removed from memory when they are no longer needed. In addition, the import job can be used to read files in various formats and import them to sashdat format in a caslib's source.

 

The sample jobs are provided as templates from which to create new jobs. Users may not edit the sample jobs, but they can copy one to create a new job and then edit the new job's settings.

 

The jobs can then be scheduled in SAS Environment Manager, or via the scheduling command-line interface. Currently the scheduler in SAS Viya supports simple time-based events. Looking at the process to load data, there are two moving parts:

  • the job
  • the schedule

The diagram below is an example of a load job.

 

autoload82_1.png

 

 

This short video below shows how to:

  • Copy an existing job
  • Edit the job arguments
  • Run and/or schedule the job
  • Check the status of the job and view the log

 

 

The trickiest step in the process is editing the job argument options. The options are stored as a JSON-formatted string.

 autoload82_2.png

 

 

The documentation provides a tip which I have found very useful. Copy the JSON for options to a JSON editor, make the changes, and then copy the result back into options. The copy-and-paste method makes it less likely that you will make a mistake in the JSON, which would cause your job to fail.
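If a JSON editor is not at hand, the same safeguard can be sketched in a few lines of Python: parse the options string, change a value, and serialize it again, so that a syntax error surfaces before the job runs. The option keys used here (`selectionFilter`, `refresh`) and the helper name `update_options` are placeholders for illustration, not the actual keys of the sample jobs; substitute whatever your copied job contains.

```python
import json


def update_options(options_text, **changes):
    """Parse a JSON options string, apply changes, and return valid JSON text."""
    try:
        options = json.loads(options_text)
    except json.JSONDecodeError as err:
        # Fail here, with a readable message, rather than inside the scheduled job.
        raise ValueError(f"options is not valid JSON: {err}") from err
    if not isinstance(options, dict):
        raise ValueError("options must be a JSON object")
    options.update(changes)
    return json.dumps(options, indent=2)


# Hypothetical options string; the keys are illustrative only.
original = '{"selectionFilter": "", "refresh": true}'
updated = update_options(original, selectionFilter="contains(name,'HR')")
print(updated)
```

Because the text is round-tripped through a parser, slips such as a trailing comma or an unbalanced quote are rejected up front, and the string that goes back into options is well formed.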

 

The addition of autoload-like capabilities is welcome. The functionality will allow administrators more control over when data in CAS is loaded, unloaded and refreshed. By combining the jobs, more complex business processes can be implemented. For example, the import job could be scheduled in conjunction with the load job to read new data from an external process, load it to sashdat in a CASLIB's source, and then subsequently lift it into memory or refresh the data already in memory.

 

As I mentioned earlier, these capabilities are documented in the Viya System Administration guide in the CAS Table state management section.

Comments

good video

How would you apply a filter for only a subset of tables in the caslib? I've tried several different options and none of them seem to work.

If you take a look at the 3.4 documentation, there is a table that explains the filter syntax. To subset tables by name you can do something like the two below:

 

  • contains(name,'HR')
  • and(or(endsWith(sourceTableName,'.sashdat'), endsWith(sourceTableName,'.csv'), endsWith(sourceTableName,'.sas7bdat')),contains(name,'HR'))

 

https://documentation.sas.com/?cdcId=calcdc&cdcVersion=3.4&docsetId=caldatamgmtcas&docsetTarget=n150...

 

One thing to be aware of is that the table names are always upper case.
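As a rough sketch, filter strings like the two above can be assembled programmatically, which also makes it easy to upper-case the name fragment. The helper `name_filter` is hypothetical; it only uses the `contains`, `endsWith`, `and` and `or` functions shown in the examples above, and the documentation table remains the authority on the syntax.

```python
def name_filter(fragment, extensions=()):
    """Build a filter that matches table names containing `fragment`."""
    # Table names are always upper case, so normalise the fragment first.
    by_name = f"contains(name,'{fragment.upper()}')"
    if not extensions:
        return by_name
    by_ext = [f"endsWith(sourceTableName,'{ext}')" for ext in extensions]
    # A single extension needs no or(...) wrapper.
    ext_clause = by_ext[0] if len(by_ext) == 1 else f"or({','.join(by_ext)})"
    return f"and({ext_clause},{by_name})"


print(name_filter("hr"))
print(name_filter("hr", [".sashdat", ".csv"]))
```

The resulting string can then be pasted into the job's filter setting.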

Author NOTE: the information in this post is relevant to the new release Viya 3.4 and VA 8.3. The details have changed for the better in the new release. The interface has changed so that you no longer have to directly edit a JSON file.

It seems it only looks at tables which have not been accessed for a period of time. 

 

Is there a way to modify the job so that any CAS table which has a creation date greater than 48hours is unloaded?

 

That is correct; the unload can only be based on the last-access datetime, not the creation datetime.
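To make the distinction concrete, here is a minimal sketch of the selection rule as described in this thread. It is not how the job is implemented; the function name, table names and timestamps are all made up for illustration. A table qualifies only when its last access is older than the idle limit, and its creation time never enters the decision.

```python
from datetime import datetime, timedelta


def tables_to_unload(last_accessed, now, max_idle):
    """Return names of tables whose last access is older than max_idle."""
    return sorted(
        name for name, accessed in last_accessed.items()
        if now - accessed > max_idle
    )


# Hypothetical tables and access times.
now = datetime(2018, 3, 13, 12, 0)
last_accessed = {
    "HR_STAFF": now - timedelta(hours=72),   # idle for three days
    "HR_PAYROLL": now - timedelta(hours=2),  # accessed recently
}
print(tables_to_unload(last_accessed, now, timedelta(hours=48)))
```

A table created long ago but queried two hours ago stays in memory under this rule, which is why a creation-date cutoff cannot be expressed with it.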

Is there a bug in SAS Viya 3.4? I'm unable to unload my dataset with the unload sample job.
 

What happens when not all of the sashdat files on the corresponding file system fit into memory during a scheduled autoload?

Michael, take a look at this post by one of my colleagues 

 

https://communities.sas.com/t5/SAS-Communities-Library/Provisioning-CAS-DISK-CACHE-for-SAS-Viya/ta-p...

 

It discusses how CAS uses both memory and disk. CAS can process data that is bigger than the available memory. But you must have enough "resources" (memory + disk) available, and if you are getting near the limits of that on your hardware then you will have problems. Hope that helps.
