BookmarkSubscribeRSS Feed

Six Easy Ways to Import Local SAS Data Sets into CAS (SAS Viya 3.5)

Started ‎07-21-2020 by
Modified ‎09-07-2020 by
Views 12,341

Have you noticed, whenever you are hungry, there is always a can that will just not open? Hungry for analytics? SAS Viya, like any good multifunctional tool, offers many functions to import a local SAS data set, stored on your PC, into CAS. The most common six are discussed in this post.

 

The What

The focus is on a client-side load (CAS-speak for importing and loading a data set you have stored locally). Client-side load is more appropriate for small files. A client is a browser, a “SAS client” (SPRE or SAS 9.4M5), or your local storage. The focus is not on server-side load: when the file is already on the CAS Controller, or on a location available to the CAS Controller or workers (mounted drive, Hadoop distribution, DNFS, etc.). Server-side load is recommended for bigger volumes and faster loading.

 

The How

Six ways to import your local sas7bdat files into CAS will be discussed. From a multifunctional tool, some prefer the can opener, others the blade. SAS Viya 3.5 lets you choose the tool of your choice:

  1. For convenience: use the visual interfaces
  2. For minimum programming: use PROC CASUTIL
  3. - 4. For the range of options: use PROC CAS table.upload CAS action
  4. - 6. Familiarity: use the data step or any SAS 9 PROC

100 SAS Viya multi functional knife sas7bdat import.PNG

 

In the following examples, we will import a single file prdsale.7bdat on the SAS Viya Service Layer Server.

 

Use the Visual Interface

A visual interface means SAS Data Explorer, SAS Environment Manager (Data tab), SAS Visual Analytics (Add data). This is by far the easiest, most convenient and ubiquitous of importing files in SAS Viya. (There might be more interfaces. If I forgot one, please leave me a comment). 

 

SAS Data Explorer example

Add data sources: Local Files > Local file:

 

2_20-SAS-Data-Explorer-Local-file-300x234.png

 

Pick the dataset from your PC:

 

3_110-SAS-Data-Explorer-Local-sas7bdat-import-options-1024x590.png

 

The dataset is uploaded in a temporary location then loaded into a CAS in-memory table.

 

Import options

The following options are available for SAS7BDAT files:

  • Password: If the file is password-protected, enter the password for the file. If the file is password-protected and you do not enter the password in this field, then the file fails to import.
  • Encryption key: If the file is encrypted, enter the encryption key for the file. Otherwise, the file fails to import.
  • Character multiplier: Modify this option to increase the number of characters that can fit in each cell so that character data truncation does not occur. The lengths for character variables can be increased by multiplying the current length by the value that you specify. You can specify a multiplier value from 1 to 5. The default value is 1. If your data contains international characters, and the data is not importing correctly, you might want to specify a higher value.

 

SAS Drive

Firstly, upload your files in the SAS folders, e.g. SAS Content / Users / <user> / My Folder.

 

4_120-SAS-Drive-local-sas7bdat-upload-1024x309.png

 

Secondly, import with (your favourite) visual interface, SAS Data Explorer, SAS Environment Manager (Data), SAS Visual Analytics (Add data).

 

5_130-SAS-Environment-Manager-local-sas7bdat-import-1024x791.png

 

 

Visual Interfaces Summary

Using the visual interfaces is the easiest, most convenient way of importing data sets in Viya. Attention points:

  • Can only import a certain size through the browser (the CAS Management service option maxFileUploadSize is set to 4 Gb by default).
  • If a customer has a frequent need to upload files larger than 4GB, changing the CAS Management service’s maxFileUploadSize is a good option*. Users have uploaded files as large as 20GB through the service and it could function for much larger files.
  • Certain import options available when coding, are not offered in the interface.
  • Note that, if the file is sitting on your local hard drive, it takes no longer to upload the file via the web UIs than it does to copy the file to somewhere [where] SAS can see it.*
  • * Input provided by David Henderson.

 

Use the Programming Interface: SAS Studio

This is a two-step process: upload to SAS Viya, then load in CAS.

 

Upload

Option 1: SAS Content

For upload, you could use the visual interfaces. The default upload location is SAS Content / Users / <user> / My Folder.

 

6_120-SAS-Drive-local-sas7bdat-upload-1024x309.png

 

 

Option 2: SAS Studio 5.x

You can upload the sas7bdat somewhere else, for example, on the SAS Viya Service Layer Server in intviya01 > Home

 

150 SAS Studio V Local sas7bdat upload 2.png

Attention point: There is a 100 MB file upload limit.

 

Option 3: file transfer (FTP)

The data set can be uploaded to the SAS Viya Service Layer Server in /home/user/ . The location is identical with intviya01 > Home (for the user) in SAS Studio.

 

8_140-FTP-Local-sas7bdat-on-SAS-Viya-server.png

 

 

FTP Pluses (+): You can upload files bigger than CAS Management service’s maxFileUploadSize (4 GB by default but configurable).

 

FTP Minuses (-): You may not have access to FTP at a client site.

 

Extra

For very large files [and frequent reloads], the SAS data sets (SAS7BDAT files) can be loaded in parallel in an MPP architecture (1 CAS controller, many CAS workers). This is server-side loading, not client-side. You should read Rob Collum’s paper (№ 4: BASE SAS ENGINE DATA SETS).

Server-side is not the scope of this post, but you should be aware of this technique.

 

Import: Use SAS Code

When to code and not use the user interface?

  • When you need more control over the load options.
  • Obviously, you can schedule the code or include it in a larger process.
  • The data set must arrive from your PC in a location accessible to SAS Viya.

PROC CASUTIL

Use CASUTIL (the same as for a flat-file, see previous article Six Ways to Import a Local File into CAS (SAS Viya 3.5) ).

* Define the user;
%let gateuserid=&sysuserid ;
%put My Userid is: &gateuserid ;
options msglevel=i ;

* Define the Viya folder;
%let folder=/home/&gateuserid./;

* Assign a BASE SAS libname;
libname indata "&folder";

* Load the prdsale.sas7bdat in the location;
* Proc casutil load SAS data sets from client machine to CAS library;
* where clause, replace, repeat, and compress statement during data load;
proc casutil ;
load data=indata.prdsale (where=(country="U.S.A."))
outcaslib="casuser" casout="&gateuserid._DATA_prdsale" copies=0 replace;
quit;

Tip: when you load a table, you need to drop your target CAS table first or specify replace, append in the load options, depending on your use case.

 

The log explains that PROC CASUTIL load data statement calls a table.addTable CAS action.

NOTE: Executing action 'table.addTable'.
NOTE: Action 'table.addTable' used (Total process time):
NOTE:       real time               0.026405 seconds
NOTE:       cpu time                0.045080 seconds (170.73%)
NOTE:       total nodes             5 (20 cores)
NOTE:       total memory            156.32G
NOTE:       memory                  5.13M (0.00%)
NOTE: INDATA.PRDSALE was successfully added to the "CASUSER(sbxbot)" caslib as "SBXBOT_DATA_PRDSALE".
92   quit;
NOTE: PROCEDURE CASUTIL used (Total process time):
      real time           0.03 seconds
      cpu time            0.02 seconds

 

PROC CAS

 

upload CAS Action

The following is a variation of the addTable CAS action. The PROC CAS upload statement loads client-side data to the server. The file is transferred to the server. It is then stored as a temporary file. The file is loaded in an in-memory table. Once loaded, the file is then removed. Loading data from a client-side file is appropriate for small data sets and when you are learning to program with CAS actions. When the data sets become larger, it is more efficient to use a server-side load with the loadTable action to access data.

 

* Load the prdsale.sas7bdat in the location;
%let dsdata=/home/&gateuserid/prdsale.sas7bdat;  
proc cas;
table.droptable / caslib="casuser" name="&gateuserid._DATA_prdsale" quiet=TRUE;
   upload path="&dsdata"               
        casOut={
        name="&gateuserid._DATA_prdsale"                                      
        caslib="casuser"
      }
     importOptions={fileType="basesas"}  
;
run;

 

The log shows, the "binary data", the data set, is handled by CAS.

NOTE: Executing action 'table.upload'.
NOTE: Executing action 'table.loadTable'.
NOTE: Cloud Analytic Services made the uploaded file available as table SBXBOT_DATA_PRDSALE in caslib CASUSER(sbxbot).
NOTE: Action 'table.loadTable' used (Total process time):
NOTE:       real time               0.078838 seconds
NOTE:       cpu time                0.098398 seconds (124.81%)
NOTE:       data movement time      0.007006 seconds
NOTE:       total nodes             5 (20 cores)
NOTE:       total memory            156.32G
NOTE:       memory                  16.16M (0.01%)
NOTE:       bytes moved             168.93K
NOTE: The table SBXBOT_DATA_PRDSALE has been created in caslib CASUSER(sbxbot) from binary data uploaded to Cloud Analytic Services.
NOTE: Action 'table.upload' used (Total process time):
NOTE:       real time               0.104025 seconds
NOTE:       cpu time                0.132942 seconds (127.80%)
NOTE:       total nodes             5 (20 cores)
NOTE:       total memory            156.32G
NOTE:       memory                  16.16M (0.01%)
{caslib=CASUSER(sbxbot),tableName=SBXBOT_DATA_PRDSALE}

 

addTable CAS action

The table.addTable documentation states that it is used by a client-specific function or method for transferring data to CAS. Product documentation advises using table.upload and never table.addTable directly. However, if today you would use addTable within proc cas, you would still manage to get a file loaded in CAS.

 

PROC CASUTIL and CAS actions summary

 

9_105-load-data-table.addTable-table.upload-table.loadTable-192x300.png

 

Data Step

You can also use a data step to import sas data sets from a SAS library.

 

* Library definition points to the folder where the data set was uploaded;
libname indata "/home/&sysuserid./";

* Drop the in-memory CAS table; proc casutil ; droptable casdata="&gateuserid._DATA_prdsale" incaslib="casuser" quiet; quit ; * Load SAS data sets from client to CAS; data casuser.&gateuserid._DATA_prdsale (copies=0 replace=yes) ; set indata.prdsale ; run;

 

The log shows that the data step runs in "the SAS client", SPRE, the SAS 9 engine in SAS Viya. The data is loaded in CAS.

 

NOTE: To run DATA step in Cloud Analytic Services a CAS engine libref must be used with all data sets and all librefs in the 
      program must refer to the same session.
NOTE: Could not execute DATA step code in Cloud Analytic Services. Running DATA step in the SAS client.
...
NOTE: There were 1440 observations read from the data set INDATA.PRDSALE.
NOTE: The data set CASUSER.SBXBOT_DATA_PRDSALE has 1440 observations and 10 variables.
NOTE: Executing action 'table.addTable'.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.03 second

 

Use a SAS Proc

You can use any SAS procs to read a data set and create the output into a CASLIB (PROC SQL, PROC PRINT, PROC CONTENTS, etc.). For example:

 

* Drop in-memory CAS table;
proc casutil ;
   droptable casdata="&gateuserid._DATA_prdsale" incaslib="casuser" quiet;
quit ;
/* load SAS data sets using SAS procs-sql from client machine to CAS */
proc sql noprint ;
   create table casuser.&gateuserid._DATA_prdsale as 
	select ACTUAL, PREDICT, COUNTRY, REGION, DIVISION, PRODTYPE, PRODUCT,
	QUARTER, YEAR, MONTH
   from indata.prdsale order by country, year ;
quit ;

 

The log shows the execution.

Note that PROC SQL is not "CAS-enabled", in other words it runs in SPRE. The data is loaded in CAS.

 

NOTE: Table CASUSER.SBXBOT_DATA_PRDSALE created, with 1440 rows and 10 columns.
92 quit ;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds

 

Conclusions

SAS Viya offers multiple functions to import your local sas7bdat files into CAS. We reviewed six functions:

  1. For convenience: use the visual interfaces.
  2. For minimum programming: use PROC CASUTIL.
  3. -4. For many options: use PROC CAS table.table.upload CAS action (addTable also explained).
  4. -6 Familiarity: use the data step or any SAS 9 PROC.

Stay tuned for more stories. And please comment, share and help others.

 

References

I would recommend the following resources:

 

Acknowledgments

 

David Henderson for his comments. Nicolas Robert for reviewing, Stephen Foerster, Mary Kathryn Queen, Uttam Kumar.

Thank you for your time reading this post. Please comment and share your experience with the Local File Import in CAS and help others. If you wish to get more information, please write me an email.

 

Version history
Last update:
‎09-07-2020 10:57 PM
Updated by:
Contributors

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started