juanvg1972
Pyrite | Level 9

Hi,

I am using the SAS COMPRESS= data set option, and the compression ratio I get is very low.

I have tried both the BINARY and CHAR methods, and the compression obtained is 2% and 8%.

My code is below. Is there a more effective way to compress the data?

Any help will be greatly appreciated:

/* Build a 1,000,000-row test table: two numerics and three short character columns */
data detalle(drop = i);
length num1 num2 8 campo1 campo2 campo3 $10;
do i = 1 to 1000000;
num1 = i;
num2 = round(20*ranuni(1));
campo1 = 'aaaaaa';
campo2 = compress('P'||round(10*ranuni(1)));
campo3 = byte(65 + round(ranuni(1)*25));
output;
end;
run;

/* Each step below overwrites COMPRIMIDA, so only the last method run is kept */
data comprimida(compress=BINARY);
set detalle;
run;

data comprimida(compress=YES);
set detalle;
run;

data comprimida(compress=char);
set detalle;
run;

proc contents data=detalle; /* 48.3 MB */
run;

proc contents data=comprimida; /* 44.3 MB */
run;

2 REPLIES
SASKiwi
PROC Star

Compression is only useful when you have a larger number of columns and/or long character columns with lots of blank space. Try making your character columns 100 or 200 characters long.
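To illustrate the suggestion above, here is a minimal sketch that rebuilds the original test table with 200-byte character columns. The data set names detalle_ancho and comprimida_ancho are made up for this example, and CATS is swapped in for the COMPRESS/|| combination only to avoid the numeric-to-character conversion note. No particular ratio is claimed here; check the NOTE in your own log.

```sas
/* Same test data as in the question, but the character columns are 200 bytes wide, */
/* so most of each value is trailing blanks (hypothetical data set names).          */
data detalle_ancho(drop=i);
  length num1 num2 8 campo1 campo2 campo3 $200;
  do i = 1 to 1000000;
    num1 = i;
    num2 = round(20*ranuni(1));
    campo1 = 'aaaaaa';
    campo2 = cats('P', round(10*ranuni(1)));
    campo3 = byte(65 + round(ranuni(1)*25));
    output;
  end;
run;

/* The log NOTE for this step reports how much compression changed the size */
data comprimida_ancho(compress=char);
  set detalle_ancho;
run;
```

Comparing the log NOTE for this step with the one from the $10 version should show whether the wider, mostly blank columns make the difference described above.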

Astounding
PROC Star

Those rates are low, but they are a reflection of the data you are compressing.

CHAR compression works best when you have repeated characters (such as missing values in your character variables).  It would also work better if you had integers as your numeric values.

BINARY works well when you have patterns of values that repeat.  In addition to the CHAR categories, that would also include missing values for numeric variables.

You can easily change the characteristics of your data to those that are better suited to compression.  But it's more likely you should choose data characteristics that more closely approximate what you expect in your real life data before deciding on the better method.
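One way to compare the two methods side by side, sketched under the assumption that the test table is WORK.DETALLE as in the question. The output names comp_char and comp_binary are hypothetical. Writing each method to its own data set avoids overwriting earlier results, and DICTIONARY.TABLES then shows the page count, file size, and compression percentage for each.

```sas
/* Write one copy per compression method in a single pass (hypothetical names) */
data comp_char(compress=char) comp_binary(compress=binary);
  set detalle;
run;

/* Compare the uncompressed table with both compressed copies */
proc sql;
  select memname, compress, pcompress, npage, filesize
    from dictionary.tables
    where libname = 'WORK'
      and memname in ('DETALLE', 'COMP_CHAR', 'COMP_BINARY');
quit;
```

Replacing detalle with a sample of the real data should give a better basis for choosing a method than synthetic values.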

Good luck.

