Six Ways to Import a Local File into CAS (SAS Viya 3.5)
Do not try and bend the spoon, that's impossible. Instead, only try to realize the truth... there is no spoon. Then you'll see that it is not the spoon that bends, it is only yourself. ―Spoon Boy to Neo
When you import a local file from your PC into CAS, do you sometimes feel you are being asked to bend the spoon just by looking at it? Here are six simple ways to import a local file into Cloud Analytic Services (CAS) and make it available in SAS® Viya™.
Why?
Simply put: no data, no analytics. It all starts with importing your files into CAS, and many files still sit on someone's computer.
What?
We will focus on a client-side load: importing a file stored on your PC and making it available in CAS. What is a client? A browser, a “SAS client” (SPRE or SAS 9.4M5), or your local storage.
The focus is not a server-side load, where the file is already on the CAS Controller or in a location available to the CAS Controller or workers (mounted drive, Hadoop distribution, DNFS, etc.). The next question is how?
Summary
Six ways to import a file into CAS (and at least three to upload it): the focus of this post is on the cases marked in green. In the next posts, you will learn about file import with Python and how to import local SAS datasets.
A bottleneck is the time it takes to upload the file to the SAS platform, which depends on where you are, where the SAS servers are located, internet speed, etc.
A simple test shows PROC CAS and PROC CASUTIL performing much faster than PROC IMPORT (in my environment). The first two run in CAS, the latter in SPRE.
Assume you need to import a single file, prdsale.csv, into the table prdsale in the casuser caslib.
Use the Visual Interface: Local File
Visual Interface = SAS Data Explorer, SAS Environment Manager (Data tab), SAS Visual Analytics (Add data). There might be more interfaces. If I forgot one, please leave me a comment.
Pluses (+): easiest, most convenient and ubiquitous in Viya.
Attention points:
- You can only import files up to a certain size through the browser (the CAS Management service option maxFileUploadSize is set to 4 GB by default). You can override this option.
- Certain import options are not available (see Summary).
SAS Data Explorer example:
Pick the file from your PC.
...and off it goes into your CASLIB.
Use the Visual Interface: SAS Drive
As an alternative, first load your files into the SAS Drive folders.
Then import with your favorite Visual Interface: SAS Data Explorer, SAS Environment Manager (Data), or SAS Visual Analytics (Add data).
For example, use SAS Environment Manager (file stored in SAS Content, e.g. SAS Content / Users / <user> / My Folder).
Attention points:
- You can only import files up to a certain size through the browser (the CAS Management service option maxFileUploadSize is set to 4 GB by default). You can override this option.
- Certain import options are not available (see Summary).
Use the Programming Interface: SAS Studio V
This is a two-step process:
- upload.
- import.
Upload
Attention point: There is a 100 MB file upload limit.
Option 1: reuse the file uploaded in SAS Content / Users / <user> / My Folder
Option 2: upload to intviya01 > Home
You can upload the file somewhere else, for example, on the SAS Viya Service Layer Server in intviya01 > Home
Option 3: file transfer (FTP):
The file is stored on the SAS Viya Service Layer Server in /home/user/. This location is the same as intviya01 > Home in SAS Studio V.
Pluses (+): You can upload files bigger than CAS Management service’s maxFileUploadSize.
Attention points:
- You may not have access to FTP at a client site.
- If you do have FTP access, you might be better off doing a server-side load rather than a client-side load (not the focus of this post). FTP the file directly to the CAS Controller or to a mounted drive. Choose the path of a caslib you have access to.
Upload Myth Buster*:
Q: Which upload method is slower:
- a file through the browser
- upload it in SAS Studio or SAS Drive
- FTP?
A: Note that, if the file is sitting on your local hard drive, it takes no longer to upload the file via the web UIs than it does to copy the file to somewhere that SAS can see it and then run SAS code that loads the data into SAS and sends it across the wire to CAS. In fact, it’s probably faster in most cases.*
Q: Can you import only 4 GB through the browser? I read the CAS Management service option maxFileUploadSize is set to 4 Gb by default.
A: If the customer has frequent need to upload files larger than 4GB, changing the CAS Management service maxFileUploadSize is a good option. I’ve uploaded files as large as 20GB through the service and see no reason why you couldn’t go much larger.*
Q: Can you use all the available import options?
A: Certain import options are not available (see Summary)
*reviewed 2020 Feb 14. Thanks to David H.
Import: Use SAS Code
When should you code instead of using the interface?
- When you need more control over the load options.
- When you want to schedule the import. Note that only the import part can be scheduled.
- The file still has to arrive from your PC in a location accessible to SAS Viya.
Remember: when you code, drop your target CAS table first, or specify replace or append in the load options.
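As a minimal sketch of the second approach, the LOAD statement in PROC CASUTIL accepts a REPLACE option, so a separate drop step is not needed for a session-scope table. The macro variables &csvdata and &gateuserid are the ones defined in the Code Wrapper at the end of this post, and an active CAS session is assumed. PROMOTE is left out here on purpose, so the table stays in session scope.

```sas
/* Load the client-side file and overwrite the session table if it already exists */
proc casutil;
   load file="&csvdata"
        outcaslib="casuser"
        casout="&gateuserid._CSV_prdsale"
        replace;
quit;
```

If the table was promoted to global scope earlier, you will most likely still need the droptable step shown in the examples below before loading and promoting it again.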
PROC CAS: Upload CAS Action
Pluses (+):
- FAST. Runs in CAS. CAS manages the upload.
- Fine-tune the loading options at a column level: use import Options to apply formats, change types and more.
- Go for proc cas if fine-tuning is your thing or your file structure is... difficult.
* Drop in-memory CAS table;
proc casutil;
   droptable casdata="&gateuserid._CSV_prdsale" incaslib="casuser" quiet;
quit;
* table.upload cas action;
proc cas;
   upload result=r status=rc /
      path="&csvdata"
      casOut={
         caslib="casuser",
         label="CSV file from client",
         lifetime=999,
         name="&gateuserid._CSV_prdsale",
         promote=TRUE,
         replication=0
      }
      importOptions={
         fileType="CSV",
         vars={{name="ACTUAL", format="DOLLAR8.2"}}
      }
   ;
quit;
For macro definitions, see the Code Wrapper at the end. The code imports the same file, but from /home/user/. This location is the same as intviya01 > Home in SAS Studio V.
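The upload action call stores its return status in rc. A minimal sketch of how you could test that status in CASL before moving on is shown here. It assumes the same &csvdata and &gateuserid macro variables from the Code Wrapper, uses replace=TRUE instead of promote so it can be rerun in the same session, and relies on the severity and status fields of the CASL status dictionary.

```sas
proc cas;
   /* Upload the client-side file, capturing results in r and status in rc */
   upload result=r status=rc /
      path="&csvdata"
      casOut={caslib="casuser", name="&gateuserid._CSV_prdsale", replace=TRUE}
      importOptions={fileType="CSV"};

   /* A severity of 0 means the action completed without warnings or errors */
   if (rc.severity == 0) then
      print "Upload succeeded";
   else
      print "Upload did not complete cleanly: " || rc.status;
quit;
```

This kind of check is mostly useful in scheduled code, where nobody is reading the log as it scrolls by.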
The Log: the file "binary data" is handled by CAS.
NOTE: The table SBXBOT_CSV_PRDSALE has been created in caslib CASUSER(sbxbot)
from binary data uploaded to Cloud Analytic Services.
NOTE: Action 'table.upload' used (Total process time):
NOTE: real time 0.102572 seconds
NOTE: cpu time 0.117076 seconds (114.14%)
NOTE: total nodes 5 (20 cores)
NOTE: total memory 156.32G
NOTE: memory 36.41M (0.02%)
NOTE: PROCEDURE CAS used (Total process time):
real time 0.11 seconds
cpu time 0.00 seconds
PROC CASUTIL Load File
Pluses (+):
- FAST. Runs in CAS. CAS manages the upload.
- Generates a table.upload cas action.
* Drop in-memory CAS table;
proc casutil;
   droptable casdata="&gateuserid._CSV_prdsale" incaslib="casuser" quiet;
quit;
* Proc casutil - file load .csv from client machine to CAS library;
proc casutil;
   load file="&csvdata"
        outcaslib='casuser' casout="&gateuserid._CSV_prdsale" copies=0 promote;
quit;
The Log:
NOTE: The table SBXBOT_CSV_PRDSALE has been created in caslib CASUSER(sbxbot) from binary data uploaded to Cloud Analytic Services.
NOTE: Action 'table.upload' used (Total process time):
NOTE: real time 0.104739 seconds
NOTE: cpu time 0.120821 seconds (115.35%)
NOTE: total nodes 5 (20 cores)
NOTE: total memory 156.32G
NOTE: memory 36.69M (0.02%)
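PROC CASUTIL also lets you pass some import options on the LOAD statement. The following is a sketch only, assuming the same macro variables from the Code Wrapper and that the IMPORTOPTIONS= sub-options shown (FILETYPE and GETNAMES) are available in your release. For column-level control, such as formats per variable, PROC CAS remains the better fit.

```sas
/* Load the CSV and state the file type and header row explicitly */
proc casutil;
   load file="&csvdata"
        outcaslib="casuser"
        casout="&gateuserid._CSV_prdsale"
        importoptions=(filetype="csv" getnames="true")
        replace;
quit;
```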
Use a SAS Studio V Import Task
One of my colleagues used to say: "there is a CAS Action for everything". Apparently, there is a SAS Studio task for everything as well...
Pluses (+): easy, generates the code for you.
Attention point:
- Does not run in CAS. The SAS client (SPRE) runs the code and pushes the data in a CAS table.
New > Import data. Choose the file location from SAS Drive: SAS Content / Users / <user> / My Folder.
The task generates PROC IMPORT / PROC CONTENTS code that runs in SPRE (the SAS engine in Viya).
And the log proves it:
NOTE: The INFILE statement is not supported with DATA step in Cloud Analytic Services.
NOTE: The INPUT statement is not supported with DATA step in Cloud Analytic Services.
NOTE: Could not execute DATA step code in Cloud Analytic Services. Running DATA step in the SAS client.
...
NOTE: The data set CASUSER.SBXBOT_CSV_PRDSALE has 1440 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.34 seconds
cpu time 0.10 seconds
PROC IMPORT
You just saw an example above. For your convenience, the code:
FILENAME REFFILE FILESRVC FOLDERPATH='/Users/sbxbot/My Folder' FILENAME='prdsale.csv';
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=CASUSER.SBXBOT_CSV_PRDSALE;
GETNAMES=YES;
RUN;
PROC CONTENTS DATA=CASUSER.SBXBOT_CSV_PRDSALE; RUN;
Conclusions
You have read about six ways to load your local files into CAS (and at least three ways to upload them). Use the following rules of thumb:
1-2 Convenience and some options: use the visual interfaces
3 Speed and many options: use PROC CAS upload action
4 Speed and some options: use PROC CASUTIL
5-6 Less speed, medium options: use PROC IMPORT or the SAS Studio V task.
In the next post, you will learn more about:
- file import using Python (SWAT)
- a recap of the dataset import.
Stay tuned for more stories. And please comment, share and help others.
References
I would recommend the following resources:
- SAS® Data Explorer 2.5: User’s Guide (visual Interface)
- SAS Viya System Programming Guide (cas actions)
- PROC CASUTIL load option
- UPLOAD CAS action
- SAS® Studio 5.2: User’s Guide (import task)
Acknowledgements
Stephen Foerster, Mary Kathryn Queen, Nicolas Robert, Uttam Kumar, David H.
Want to try the content of this post yourself?
Code Wrapper
The following code simulates a file being uploaded in /home/&gateuserid/
* Wrapper code;
CAS mySession SESSOPTS=( CASLIB="casuser" TIMEOUT=999 LOCALE="en_US" metrics=true);
%let gateuserid=&sysuserid ;
%put My Userid is: &gateuserid ;
options msglevel=i ;
caslib _all_ assign;
* Define the files loaded;
%let csvdata=/home/&gateuserid/prdsale.csv;
%let dsdata=/home/&gateuserid/prdsale.sas7bdat;
%let folder=/home/&gateuserid./;
* Upload the csv files in /home/&gateuserid. folder;
* or Simulate upload: save a csv file under user’s home directory;
proc export data=sashelp.prdsale
outfile="&csvdata" REPLACE dbms=dlm;
putnames=yes; delimiter=',';
run;
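* Assumption: DMS and indata were pre-assigned libraries in the original environment;
* Here indata points at the home folder and the source library is sashelp;
libname indata "&folder";
proc copy in=sashelp out=indata;
select prdsale;
run; quit;
proc contents data=indata.prdsale;
run;
libname indata clear;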
*The code from each proc goes in here;
* list files and in-memory tables;
proc casutil incaslib="casuser" ;
list files; list tables;
quit;
*When you’re finished, clean-up;
* Drop in-memory CAS table;
proc casutil ;
droptable casdata="&gateuserid._CSV_prdsale" incaslib="casuser" quiet;
droptable casdata="&gateuserid._DATA_prdsale" incaslib="casuser" quiet;
quit ;
CAS mySession TERMINATE;
Want to Learn More about Viya 3.5?
- Two Simple Ways to Import Local Files with Python in CAS (Viya 3.5)
- DevOps Applied to SAS Viya 3.5: Run a SAS Program with a Jenkins Pipeline
- DevOps Applied to SAS Viya 3.5: Top Git Commands with Examples
- Query Performance? Use a CAS Star Schema in Viya 3.5
- Automatic Data Loading with a CAS Data Source Star Schema in Viya 3.5
- Go With the Job Flow in SAS Viya 3.5
- How to Load Images in SAS Viya 3.5
- Improve Your Relationships: With a REST API in SAS Viya 3.5
Thank you for your time reading this post. Please comment and share your experience with the Local File Import in CAS and help others.