BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Muandre
Obsidian | Level 7

Dear all,

 

uploading und promoting my 800 MB SAS table leads to a 50 GB CAS table. What is my mistake?

 

cas;
caslib _ALL_ assign;

libname BIB1 oracle uid=*** user=*** ... ;

data work.test(compress=yes);
set BIB1.ds;
run;

 

proc casutil;
load data=work.test outcaslib="public"
casout="test" promote;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Muandre 

 

We ran into the same problem and found out that CAS supports the data type VARCHAR, so we dropped all experiments with compress.

 

We write the CAS table with a data step that defines new CAS variables with following statements:

  • length newvar varchar(n);  *where n = longes string in input;
  • newvar = oldvar;
  • drop oldvar;

and so on..

 

In our setup the code is built into a DI Studio transformation that generates the data step from the actual input column specifications with proc execute. The incoming variables are renamed in the set statement, so the original names can be reused as new variable names.

 

View solution in original post

7 REPLIES 7
SASKiwi
PROC Star

I suspect CAS tables are not compressed. What is the size of your SAS WORK table test when uncompressed.

Muandre
Obsidian | Level 7

my sas work.test table is uncompressed 30 GB. What is the correct comress-way to load the compressed table to CAS?

Kurt_Bremser
Super User

CAS is built for speed and not for space, so compression is not wanted.

Get to know your data better; since you have a rather high compression ratio, this points to character variables that are defined longer than needed.

Inspect your charvars like

proc sql;
select max(length(charvar)) from have;
quit;

so you can do corrections of defined lengths before uploading to CAS; you can also check if there is a problem in your data import process that leads to overlong defined character variables (import from Excel tends to do that).

SASKiwi
PROC Star

According to the doc try this:

proc casutil;
load data=work.test (compress) outcaslib="public"
casout="test" promote;
run;
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Muandre 

 

We ran into the same problem and found out that CAS supports the data type VARCHAR, so we dropped all experiments with compress.

 

We write the CAS table with a data step that defines new CAS variables with following statements:

  • length newvar varchar(n);  *where n = longes string in input;
  • newvar = oldvar;
  • drop oldvar;

and so on..

 

In our setup the code is built into a DI Studio transformation that generates the data step from the actual input column specifications with proc execute. The incoming variables are renamed in the set statement, so the original names can be reused as new variable names.

 

Muandre
Obsidian | Level 7

Thank you - our experiments with compress in proc casutil didn't help (well - they helped with size but as I learnt today: "CAS is built for speed and not for space, so compression is not wanted."

Working with length in our oracle-view helped <3. Thank you!

SASKiwi
PROC Star

@Muandre  - Your experience mirrors ours with the SAS VA LASR servers on SAS 9. You trim character variables to optimal length rather than compressing datasets.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1324 views
  • 6 likes
  • 4 in conversation