alisio_meneses
Quartz | Level 8

Hi there,

 

I see that 'PROC CASUTIL SAVE' can write an in-memory CAS table to multiple parquet files stored inside a CASLIB folder. I wonder if it is possible to write it to a single parquet file instead.

 

Is it? If so, how?

 

Env. info: SAS Viya 3.5 running on Linux with multiple CAS workers.

 

Thank you!

sbxkoenk
SAS Super FREQ

... to multiple parquet files??

Strange.
By default, one CAS-table goes into one .parquet file.
Why would it be split across multiple *.parquet files?

proc casutil;
   save casdata="carsInMemory" casout="carsFile.parquet";
run;

See here :

SAS® 9.4 and SAS® Viya® 3.5 Programming Documentation
CAS User’s Guide
Parquet Data Sets

https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/casref/p0u5p2nvqu04gfn1w3zaohdfcoys.htm#n0...

 

Koen

Patrick
Opal | Level 21

@sbxkoenk In my environment, with a recent Viya 4 version and 4 worker nodes, the parquet file is created in chunks (multiple files), all stored under a folder named after the value passed to the casout parameter.

I couldn't find a way to create only a single file using PROC CASUTIL. I believe the chunks are required for full support of parallelism.

I could create a single parquet file via client-side (compute) processing using a DATA step.

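/* Start from a clean CAS session */
%let sessref=MySess;
%if %sysfunc(sessfound(&sessref)) %then
  %do;
    cas &sessref terminate;
  %end;
cas &sessref cassessopts=(caslib="casuser" /*metrics=True*/);
libname casuser cas;
options fullstimer msglevel=i ps=max;

/* Load a small test table into CAS */
data casuser.class;
  set sashelp.class;
run;

/* Option 1: compute-side DATA step through the parquet LIBNAME engine (single file) */
libname comp_pq parquet "&_userhome";

data comp_pq.class_datastep;
  set casuser.class;
run;

/* Option 2: PROC CASUTIL SAVE (folder of part files); outcaslib names the target caslib explicitly */
caslib cas_pq path="&_userhome" datasource=(srctype="path");
proc casutil;
  save casdata="class" incaslib="casuser" casout="class_casutil.parquet" outcaslib="cas_pq" replace;
quit;

/* cas &sessref terminate; */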
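
To compare the two outputs, a check along the following lines might help. It is only a sketch: it assumes the CAS session, the cas_pq caslib, and both outputs from the code above still exist, that the CAS server can read the path behind cas_pq, and that CAS accepts the folder written by SAVE as the source of a single table. The table name class_check is made up for illustration.

```sas
proc casutil;
  /* Show what was written under the caslib path: the DATA step output
     should appear as a file, the SAVE output as a folder */
  list files incaslib="cas_pq";

  /* Load the folder written by SAVE back into CAS as one table
     (class_check is a hypothetical name) */
  load casdata="class_casutil.parquet" incaslib="cas_pq"
       casout="class_check" outcaslib="casuser" replace;

  /* Table information, including the row count, for the reloaded table */
  contents casdata="class_check" incaslib="casuser";
quit;
```

If the reload works, the row count reported for class_check should match that of sashelp.class, which would confirm that the chunked folder still behaves as one logical table on the CAS side.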

 

 

sbxkoenk
SAS Super FREQ

Sorry. I had better not have said anything. I have too little experience with Parquet files. ☹️😞

 

Maybe @UttamKumar can help.

 

SAS Viya and Parquet files – additional features
Started ‎01-26-2023 | Modified ‎01-26-2023

by UttamKumar
https://communities.sas.com/t5/SAS-Communities-Library/SAS-Viya-and-Parquet-files-additional-feature...

 

Koen

alisio_meneses
Quartz | Level 8
Hello, thanks for the reply. Using save generates a single folder named <tablename.parquet> with multiple parquet files inside. I guess that's for optimization purposes, but I'm not sure.
