
Load All Files in Directory to CAS

Started ‎01-10-2023 by
Modified ‎03-01-2023 by

This topic has been covered before alongside other topics (CASL variables, asynchronous CAS actions, ...) and discussed on various forums, but I could not find a single, easy-to-find place that covers it on its own. Since it is requested often, I figured I'd dedicate an entire post to it so people can find it easily. At the end of the post, I'll reference the other content that is out there.

 

The Basics

 

To perform this operation we'll need a path caslib and the fileInfo and loadTable CAS actions, along with CASL's processing logic and result variable capabilities. Check out the links provided for background on each.

 

First, we create our path caslib pointing to the directory holding the files. Then we use the fileInfo action to list the (loadable) files in the directory and send that list into a result variable. Then we simply loop over that list with the loadTable action to load the files into CAS.

 

At its most basic, the code looks like this:

 

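/* Path caslib pointing to the directory that holds the files */
caslib mylib path="/local/data";

proc cas;
  /* List the files in the caslib's directory and keep the result in filesToLoad */
  table.fileinfo result=filesToLoad / caslib="mylib";

  /* Loop over the rows of the FileInfo table and load each file into CAS */
  do i = 1 to dim(filesToLoad.FileInfo);
      fname = filesToLoad.FileInfo[i,'Name'];
      table.loadtable /
         caslib="mylib"
         path=fname;
  end;
run;
quit;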

 

The key to the scheme is the "filesToLoad" CASL variable, which is assigned by the result= option on the fileInfo action. It is a dictionary that contains a table, FileInfo, holding all kinds of information about the files in the data source directory, one row per file. In the example above, we iterate over the rows and read each file name with filesToLoad.FileInfo[i,'Name'].
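To see what the result variable actually holds before looping over it, a minimal sketch like the following can help. It assumes an active CAS session and the "mylib" caslib from the example above; the last print assumes the directory contains at least one file.

```sas
proc cas;
  /* Same listing call as in the basic example */
  table.fileInfo result=filesToLoad / caslib="mylib";

  /* Show the structure of the result: a dictionary holding the FileInfo table */
  describe filesToLoad;

  /* Print the whole table, one row per file */
  print filesToLoad.FileInfo;

  /* Print only the name of the first file (assumes at least one row) */
  print filesToLoad.FileInfo[1,'Name'];
run;
quit;
```

Looking at the printed table first is a cheap way to confirm which column names are available and whether the directory contains files you would rather not load.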

 

Going Beyond

 

This approach also works with database and other types of caslibs, since the fileInfo action always returns the source file or table information for whatever data source the caslib points to.
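As a rough sketch of the same loop against a database caslib: the caslib name "mydb" below is hypothetical and is assumed to be already defined with working connection options, and the variable names srcTables and tname are made up for illustration. The casOut parameter with replace=true is added here so that rerunning the loop overwrites tables that are already in memory; that is an addition, not part of the basic example.

```sas
proc cas;
  /* "mydb" is a hypothetical, already-defined database caslib */
  table.fileInfo result=srcTables / caslib="mydb";

  /* For a database caslib, each row of FileInfo describes a source table */
  do i = 1 to dim(srcTables.FileInfo);
      tname = srcTables.FileInfo[i,'Name'];
      table.loadTable /
         caslib="mydb"
         path=tname
         /* Output goes to the active caslib; replace an existing in-memory table */
         casOut={name=tname, replace=true};
  end;
run;
quit;
```

Whether loading every source table is sensible depends on the database; for large schemas you would likely want to filter the list before loading.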

 

If you want to make the process more robust, check out these links to other resources:

 

   Iterate over the source files and load them in parallel

 

   A SAS Studio Flow Custom Step that loads all of the files from a file system directory

 

 

Comments

I love it! thank you for publishing this.

Version history
Last update:
‎03-01-2023 04:05 PM
Updated by:
