alisio_meneses
Quartz | Level 8

Hi there,

I see that 'PROC CASUTIL SAVE' can write an in-memory CAS table to multiple parquet files stored inside a CASLIB folder. I wonder whether it is possible to write it to a single parquet file instead.

Is it? If so, how?

Env. info: SAS Viya 3.5 running on Linux with multiple CAS workers.

Thank you!

sbxkoenk
SAS Super FREQ

... to multiple parquet files??

Strange.
By default, one CAS-table goes into one .parquet file.
Why would it be split across multiple *.parquet files?

proc casutil;
   save casdata="carsInMemory" casout="carsFile.parquet";
quit;

See here :

SAS® 9.4 and SAS® Viya® 3.5 Programming Documentation
CAS User’s Guide
Parquet Data Sets

https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/casref/p0u5p2nvqu04gfn1w3zaohdfcoys.htm#n0...

 

Koen

Patrick
Opal | Level 21

@sbxkoenk In my environment (a recent Viya 4 version with 4 worker nodes), the parquet file is created in chunks (multiple files), all stored in a folder named after the parquet file given in the casout parameter.

I couldn't find a way to only create a single file using Proc Casutil. I do believe that chunks are required for full support of parallelism.

I could create a single parquet file via client side (compute) processing using a data step.

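/* Start a fresh CAS session; terminate an existing one with the same name first */
%let sessref=MySess;
%if %sysfunc(sessfound(&sessref)) %then
  %do;
    cas &sessref terminate;
  %end;
cas &sessref cassessopts=(caslib="casuser" /*metrics=True*/);
libname casuser cas;
options fullstimer msglevel=i ps=max;

/* Load a small test table into CAS */
data casuser.class;
  set sashelp.class;
run;

/* Client-side (compute) write through the Parquet LIBNAME engine: one file */
libname comp_pq parquet "&_userhome";

data comp_pq.class_datastep;
  set casuser.class;
run;

/* Server-side write with PROC CASUTIL: a folder of part files.
   The new caslib becomes the active caslib, so SAVE writes there. */
caslib cas_pq path="&_userhome" datasource=(srctype="path");
proc casutil;
  save casdata="class" incaslib="casuser" casout="class_casutil.parquet" replace;
quit;

/* cas &sessref terminate; */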
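
To check that the folder written by PROC CASUTIL is still usable as one table, it can presumably be loaded back by naming the folder. This is only a minimal sketch. It assumes the cas_pq caslib from the code above is still assigned, and the table name class_reloaded is made up for illustration. Reading a folder of part files through its folder name is an assumption worth verifying on your release.

```sas
proc casutil;
  /* Load the folder of part files back as one in-memory table.
     class_reloaded is a hypothetical name used only for this check. */
  load casdata="class_casutil.parquet" incaslib="cas_pq"
       casout="class_reloaded" outcaslib="casuser" replace;

  /* Row and column counts should match the original casuser.class table */
  contents casdata="class_reloaded" incaslib="casuser";
quit;
```

If the counts match, the chunking only affects how the table is laid out on disk, not what CAS reads back.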

 

 

sbxkoenk
SAS Super FREQ

Sorry. I should not have said anything; I have too little experience with Parquet files. ☹️😞

Maybe @UttamKumar can help.

SAS Viya and Parquet files – additional features
Started 01-26-2023 | Modified 01-26-2023
by UttamKumar
https://communities.sas.com/t5/SAS-Communities-Library/SAS-Viya-and-Parquet-files-additional-feature...

 

Koen

alisio_meneses
Quartz | Level 8
Hello, thanks for the reply. Using SAVE generates a single folder named <tablename.parquet> with multiple parquet files inside. I guess that's for optimization purposes, but I'm not sure.
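
One way to confirm what SAVE wrote is to ask CAS for the file listing of the caslib. This is a sketch under assumptions: a path-based caslib named cas_pq and a saved folder named class_casutil.parquet, both borrowed from Patrick's example, so substitute your own names. I have not verified that pointing path= at the folder lists the part files on every release.

```sas
proc cas;
  /* Top level of the caslib, including subfolders such as <tablename.parquet> */
  table.fileInfo / caslib="cas_pq" includeDirectories=true;

  /* Assumed to list the part files inside the saved folder */
  table.fileInfo / caslib="cas_pq" path="class_casutil.parquet";
quit;
```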
