Hello,
I have a .sav file that I can't upload to SAS OnDemand because it is over 1 GB, which is the upload limit for SAS OnDemand. What can I do to get it uploaded?
To share a picture use the insert picture icon instead of the attach file icon in the editor tool bar.
Try compressing the file and uploading the compressed version. For example, put the file in a ZIP file. If the ZIP file is smaller than the limit, you can upload it and then use SAS code to unzip it into your work folder. Say your file is named myspss.sav and you zipped it into a file named myfile.zip, which you uploaded to your home directory on the SAS server.
You could then use SAS code like this to convert it into a dataset named MYSPSS in the work directory and have any value "labels" (aka formats) generated into a format catalog named FORMATS in the work directory.
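/* Point at the .sav member inside the uploaded ZIP file */
filename in zip '~/myfile.zip' member='myspss.sav' recfm=f lrecl=512;
/* Temporary file that receives the unzipped copy */
filename copy temp recfm=f lrecl=512;
/* Binary copy from the ZIP member to the temporary file; rc=0 means success */
%let rc=%sysfunc(fcopy(in,copy));
proc import dbms=sav datafile=copy
  out=work.myspss replace
;
  fmtlib=work.formats;
run;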
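Before uploading, it may help to check on the client whether the ZIP file actually ends up under the limit. The following is only a sketch: `limit_bytes` assumes the 1 GB limit means 1024^3 bytes, the `under_limit` helper and the demo file are hypothetical, and the `zip` command is used only if it happens to be installed.

```shell
# Assumption: the 1 GB upload limit is counted as 1024^3 bytes.
limit_bytes=1073741824

under_limit() {
    # $1 = path to a file; succeeds when the file is smaller than limit_bytes
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -lt "$limit_bytes" ]
}

# Demo on a small, highly compressible stand-in for the real myspss.sav
demo_dir=$(mktemp -d)
head -c 100000 /dev/zero > "$demo_dir/myspss.sav"

if command -v zip >/dev/null 2>&1; then
    # Create myfile.zip next to the stand-in file
    (cd "$demo_dir" && zip -q myfile.zip myspss.sav)
    if under_limit "$demo_dir/myfile.zip"; then
        echo "myfile.zip fits under the limit"
    fi
fi
```

For a real file, replace the stand-in with the actual .sav file and run `under_limit` on the resulting ZIP file before attempting the upload.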
Here is the picture inserted.
Given that SAS ODA is a learning environment, uploading such a huge file is likely outside its intended use. Having said that, it would be simple enough to split the file into smaller chunks on the client side and then upload those chunks (for example under WSL/Ubuntu, using something like: split -b 50MB big_file.sav chunk_ )
Where I got stuck is on the SAS ODA side, because it doesn't allow OS commands (the NOXCMD option is set).
If this weren't the case then one could simply use a Unix cat command like cat chunk_* > big_file.sav
I've done some testing with cars.sas7bdat, split it on the client side into 1KB chunks, uploaded the chunks, used cat on the server side and then printed the recombined file using Proc Print - so I know the approach works as such.
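The client-side half of this split-and-recombine test can be sketched as below. This is a minimal demo under stated assumptions: a small random file stands in for the real data, the 1000-byte chunk size and all file names are illustrative, and `cat` is run locally only to confirm that the chunks reassemble byte for byte.

```shell
# Hypothetical stand-in: a 2500-byte random file instead of the real big_file.sav
workdir=$(mktemp -d)
head -c 2500 /dev/urandom > "$workdir/big_file.sav"

# Split into 1000-byte chunks named chunk_aa, chunk_ab, chunk_ac
split -b 1000 "$workdir/big_file.sav" "$workdir/chunk_"

# Recombine; the shell expands chunk_* in sorted order, matching split's suffixes
cat "$workdir"/chunk_* > "$workdir/rebuilt.sav"

# cmp -s exits 0 only when both files are byte-identical
if cmp -s "$workdir/big_file.sav" "$workdir/rebuilt.sav"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi

chunk_count=$(ls "$workdir"/chunk_* | wc -l | tr -d ' ')
echo "chunks: $chunk_count, round trip: $roundtrip"
# prints "chunks: 3, round trip: ok"
```

Note that the last chunk (500 bytes here) is shorter than the others whenever the file size is not an exact multiple of the chunk size.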
@Tom Any idea how to mimic a cat command using SAS code? It must be something with recfm=N, but I just couldn't make it work.
Attached the 1KB chunk files I've used for my testing.
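Perhaps you had trouble because your chunks are actually 1,000 bytes each and not 1 KB, which would be 1,024 bytes?
/* Write the recombined file into the WORK directory so it can be read as WORK.CARS */
filename out "%sysfunc(pathname(work))/cars.sas7bdat";
data _null_;
  /* Read every member of the ZIP file as fixed 1,000-byte records; &path is set elsewhere */
  infile "&path/cars_chunks.zip" zip member="*" recfm=f lrecl=1000 ;
  file out recfm=f lrecl=1000;
  input ;
  /* Copy each record unchanged */
  put _infile_ ;
run;
proc means data=cars;
run;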
Thanks @Tom, and yes, you are correct. I created chunks of 1,000 bytes.
I would have expected a solution with recfm=N.
For the OP's file, 50MB chunks feel more appropriate. How would you make your approach work for such a case?
@Patrick wrote:
Thanks @Tom, and yes, you are correct. I created chunks of 1,000 bytes.
I would have expected a solution with recfm=N.
For the OP's file, 50MB chunks feel more appropriate. How would you make your approach work for such a case?
The same code would work if by 50MB you mean an exact multiple of 1,000 bytes. Otherwise, use any LRECL between 1 and 32767 that evenly divides the chunk size.
If you have multiple zip files you will probably need to loop over a list of files instead of being able to do it with a single * wildcard. So get a list of the ZIP files and then try something like:
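data _null_;
  /* FILELIST holds one ZIP file path per observation in the variable FILENAME */
  set filelist;
  file 'cars.sas7bdat' recfm=f lrecl=1000;
  infile dummy zip filevar=filename member='*' recfm=f lrecl=1000 end=eof;
  /* Copy every fixed-length record of the current ZIP file */
  do while (not eof);
    input;
    put _infile_;
  end;
run;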
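The divisibility rule mentioned above (an LRECL between 1 and 32767 that evenly divides the chunk size) can be checked on the client before splitting. The sketch below is illustrative only: `lrecl_ok` and `largest_lrecl` are hypothetical helper names, and `chunk_bytes` assumes GNU split, where `-b 50MB` means 50 × 1000 × 1000 bytes. It looks only at the nominal chunk size, not at the final chunk, which is usually shorter.

```shell
lrecl_ok() {
    # $1 = chunk size in bytes, $2 = candidate LRECL
    # Succeeds when the LRECL is in 1..32767 and divides the chunk size exactly
    [ "$2" -ge 1 ] && [ "$2" -le 32767 ] && [ $(( $1 % $2 )) -eq 0 ]
}

largest_lrecl() {
    # Walk down from the 32767 ceiling to the first value that divides $1
    n=32767
    while [ "$n" -ge 1 ]; do
        if [ $(( $1 % n )) -eq 0 ]; then
            echo "$n"
            return 0
        fi
        n=$(( n - 1 ))
    done
}

chunk_bytes=50000000   # assumption: chunks made with GNU split -b 50MB

if lrecl_ok "$chunk_bytes" 1000; then
    echo "lrecl=1000 divides the chunk size"
fi
if ! lrecl_ok "$chunk_bytes" 1024; then
    echo "lrecl=1024 does not divide the chunk size"
fi
echo "largest usable lrecl: $(largest_lrecl "$chunk_bytes")"
# prints "largest usable lrecl: 31250"
```

A larger LRECL means fewer records for the DATA step to copy, so picking a large divisor may be worthwhile for big chunks.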
@Tom Oh, I see. The Unix split command lets you define the chunk size, so I guess it just needs to be a multiple of whatever I define as lrecl (up to the maximum of 32767). I'll run a test and let you know how it went.
If compressing the .sav file into a ZIP does not bring the file size to below 1 GB:
You can convert the .sav file into a CSV file and remove any columns that are not needed before uploading it to SAS.
If all the data within the .sav file is needed, break the file into multiple parts and recombine once loaded into SAS.
That might work, but then the user will have to combine the files in SAS OnDemand to do any meaningful analysis, and likely this will again hit some sort of SAS data set size limit. SAS OnDemand was not designed to handle humongous amounts of data.
Paige Miller