
Ways to Handle the 100mb Data Upload Restriction in SAS Viya for Learners, Part I

Started 03-08-2023, modified 03-23-2023

Part I: How much can you (com)press?

 

In SAS Viya for Learners (VFL), academics are provided with 5 GB of cloud storage space. To contain ongoing cloud costs – and to ensure a positive user experience for all learners – SAS VFL administrators have limited uploads to a maximum of 100 MB per file.

 

But… what if your file is greater than 100 MB?

 

All is not lost. This series of library articles provides you with options to get your data into Viya for Learners when files are “large”. One approach is to compress the file, upload that significantly smaller file, and then decompress the file in VFL. Another approach is to break the file into smaller pieces, upload those pieces (which could also be compressed), and recombine the pieces in VFL. These approaches are covered in Part I and Part II of this series, respectively.

 

If it sounds like you’re in the right place, then let’s get started!

 

We’ll begin our journey by launching VFL3.5 and landing on the SAS Drive page. Find the helpful little Hamburger in the upper left-hand corner, which is also labeled Show list of applications:

 

LGroves_0-1678286227002.png

 

Click on the Hamburger and then select Develop SAS Code:

 

LGroves_1-1678286227022.png

 

This will open SAS Studio:

 

LGroves_2-1678286227114.png

 

If this is your first time in SAS Studio, welcome! We’ll start by uploading data into VFL. Many of you may know the process. But, I suspect that not all of you do. For the latter, let this be a (hopefully) helpful introduction.

From the left pane, you’ll notice several options. Click on the Explorer icon and then find the casuser folder under Home in your user directory. My user directory is pccesc23116:

 

LGroves_3-1678286227128.png

 

Uploading files in VFL is a mere right-click away once you’ve found the right location for the file. We could create a subfolder for this project but, to keep things simple, let’s put our uploaded data directly into the casuser folder. Right-click on casuser and find Upload files:

 

LGroves_4-1678286227154.png

 

A new dialog box appears:

 

LGroves_5-1678286227180.png

 

As the headlining note suggests, the size limit for each selected file is 100 MB. And, yes, this is a hard limit – so files larger than this threshold will be rejected.

 

One clever way to get around that 100 MB limit is to upload a compressed file that is under 100 MB when compressed but larger than 100 MB when uncompressed. In this example, I’m going to upload an .XLSX dataset that I used for a Data & Analytics for Good Journal article focused on vulnerable high school students during the early stages of the COVID pandemic. The zipped file is on my local drive and appears as:

 

LGroves_6-1678286227186.png

 

Yes, it’s small (1.6 MB). But this is just for illustrative purposes. After the data sets are selected (just one here), click Upload. The file then appears under the casuser folder:

 

LGroves_7-1678286227191.png

 

You can try to double-click the file to open it – but nothing good is gonna happen. So, let’s create a new SAS Program and use a code-based solution! Click on the New tab button to create a new SAS Program:

 

LGroves_8-1678286227193.png

 

And we’ll adapt code provided by SAS All-Star Chris Hemedinger, found in this blog:

https://blogs.sas.com/content/sasdummy/2015/05/11/using-filename-zip-to-unzip-and-read-data-files-in...

More specifically, let’s identify the file that we’d like to unzip in the first line of your new program:

 

filename inzip ZIP "/shared/home/Lincoln.Groves@sas.com/casuser/Vulnerable_Populations_DataSet.zip";

 

Ensure that the “…/Vulnerable_Populations_DataSet.zip” path matches your file location in VFL exactly, including upper and lower case. And a gentle reminder that right-clicking on the file name and then selecting Properties yields the File Properties window, one that looks like this:

 

LGroves_10-1678286227217.png
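Before extracting anything, it can help to confirm that SAS can actually see the archive and what is inside it. Here is a minimal sketch, adapted from the same blog post by Chris Hemedinger, that lists the members of the ZIP file. It assumes the inzip fileref from the FILENAME statement above; the data set name zip_contents is just an illustrative choice.

```sas
/* Assumes the INZIP fileref defined in the FILENAME statement above */
data zip_contents(keep=memname);
   length memname $200;
   fid = dopen("inzip");          /* open the ZIP file as if it were a directory */
   if fid = 0 then do;
      put "ERROR: could not open the ZIP file. Check the path in the FILENAME statement.";
      stop;
   end;
   do i = 1 to dnum(fid);         /* loop over the members of the archive */
      memname = dread(fid, i);
      output;
   end;
   rc = dclose(fid);
run;

/* zip_contents is a hypothetical name; any WORK data set name will do */
proc print data=zip_contents noobs;
run;
```

If the listing shows the .XLSX member you expect, the member name in the next step should match it exactly.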

 

The next step is to modify this key block of code from Chris Hemedinger’s blog:

 

LGroves_11-1678286227239.png

 

Our file has a different name than the one used in the blog. So, let’s make the following changes:

 

/* identify a temp folder in the WORK directory */
filename xl "%sysfunc(getoption(work))/Vulnerable_Populations_DataSet.xlsx" ;
 
/* hat tip: "data _null_" on SAS-L */
data _null_;
   /* using member syntax here */
   infile inzip(Vulnerable_Populations_DataSet.xlsx) 
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file   xl lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;
 
proc import datafile=xl dbms=xlsx out=Vulnerable_Populations_DataSet replace;
run;

 

In the PROC IMPORT step, note that I dropped the sheet= statement and saved the data set to the WORK library. (How do we know this? No permanent library is named in the out= option.) I highly recommend saving these files to WORK while you are wrangling the data, as WORK will be cleared after your session is over. And this is a good thing, as it will save space against your 5 GB storage limit in VFL.

 

Finally, let’s confirm that we’ve created a temporary SAS data set from the imported (and compressed) .XLSX file. From the left pane, find the Libraries icon. Then navigate to Libraries >> Work >> Vulnerable_Populations_DataSet:

 

LGroves_13-1678286227302.png

 

Double click the data set to open it:

 

LGroves_14-1678286227357.png
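If you prefer code to clicking, the same check can be sketched with two short procedures. This assumes the import above ran without errors and created Vulnerable_Populations_DataSet in WORK.

```sas
/* Describe the imported table: variable names, types, and number of observations */
proc contents data=work.Vulnerable_Populations_DataSet;
run;

/* Peek at the first five rows */
proc print data=work.Vulnerable_Populations_DataSet(obs=5);
run;
```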

 

Yay! That looks like a SAS data set to me! Moreover, the process above also works for zipped .CSV and SAS files – you’ll just need to tweak the code slightly.
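As one possible tweak for a zipped .CSV file, the sketch below reuses the same copy-then-import pattern and only swaps the file names and the dbms= value. The archive name my_data.zip, the member my_data.csv, the filerefs csvzip and csvout, and the your-user-id part of the path are all hypothetical placeholders; replace them with your own.

```sas
/* Hypothetical archive location and member name: replace with your own */
filename csvzip ZIP "/shared/home/your-user-id/casuser/my_data.zip";

/* Target for the extracted copy, in the WORK directory */
filename csvout "%sysfunc(getoption(work))/my_data.csv";

/* Copy the member out of the archive in 256-byte chunks, as in the XLSX example */
data _null_;
   infile csvzip(my_data.csv)
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file   csvout lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;

/* Import the extracted CSV into WORK; scan all rows when guessing column types */
proc import datafile=csvout dbms=csv out=my_data replace;
   guessingrows=max;
run;
```

The guessingrows=max statement is optional. It makes the import slower on big files, but it reduces the chance of truncated character columns.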

 

Finally, before wrapping up Part I, I’ll provide a link to the underlying dataset, in case you’d like to try this out for yourself. Grab the data here: https://github.com/lincolngroves/Data-and-Analytics-for-Good-Journal-v1

 

See you in Part II!

Comments

These SAS Community Posts were part of a larger effort to support student engagement in the 2023 SAS Hackathon.  Please find the full series below:

 
