BookmarkSubscribeRSS Feed
nikhil_khanolkar
Calcite | Level 5

Hi,

We have a  compressed table in LASR server with the size of 11GB. Table has 160 million records and 33 variables. Around 25 of these are character variables.
Since dashboard consuming this table was giving slow performance, we uncompressed this table. Size of the uncompressed table rose to 680 GB from 11GB.

Compression ratio of the said table in LASR is 65%. With this ratio, size of the compressed data has been increased 61 times.

To improve the dashboard performance data needs to be stored in uncompressed form.  Are there any techniques/best practices that can be used to control the data size?

Thanks,

Nikhil

8 REPLIES 8
MichelleHomes
Meteorite | Level 14

Hi Nikhil,

I would definitely look at reducing the length of your character columns. If you are getting 65% compression it could be due to having a lot of padding at the end of your columns. If you have a column that takes up a max of 8 characters and is defined as a length of 50 then you have 150 million records of ~40 characters wasted space (and memory in uncompressed format).

I'd suggest an analysis comparing the defined column length and the actual length used (see Length Function in SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition) using a data step. Then determine what you actually need rather than 'nice to have' as the defined length of the column

A SAS Global Forum 2015 paper that shares some tips and tricks with data prep you may want to look at is http://support.sas.com/resources/papers/proceedings15/SAS1905-2015.pdf

Please let us know how you go.

Kind Regards,

Michelle

//Contact me to learn how Metacoda software can help keep your SAS platform secure - https://www.metacoda.com
Kurt_Bremser
Super User

My math understanding tells me a compression rate of 65% should lead to an increase of about 200% when compression is omitted. When data size increases by a factor of more than 60, the compression rate should be around 98.4%. This puzzles me somewhat.

LinusH
Tourmaline | Level 20

Agree.

If you are certain about your figures, I would open track to SAS tech support.

This is not a Star Schema?

Data never sleeps
nikhil_khanolkar
Calcite | Level 5

Hi Linus,

Please find below compression info log of the LASR table for your reference.

The IMSTAT Procedure

                                                      Compression Information

                                             Data Source          HPS.INVENTORY_MON_DS
                                             Table                HPS.INVENTORY_MON_DS
                                             Size                 6.8e+02GB          
                                             Compressed Size      11GB               
                                             Compression Ratio    65 

Thanks,

Nikhil

Kurt_Bremser
Super User

OK, so it is not percent, but the factor.

Look at your character variables first. I guess you have some that are defined quite big and mostly empty, or have repeat values that are easily compressed.

You can also gain with resizing numeric vars. Dates don't need 8 bytes, for instance.

LinusH
Tourmaline | Level 20

How do you know that it's the compression per se that slows down your dashboard? My guess would be that the "uncompression ratio" during execution slows down performance. Idon't know much abour LSAR compression, but a 98% in Base SAS would be considered sensational, and definitely signals that your data isn't modeled to it's purpose.

Data never sleeps
nikhil_khanolkar
Calcite | Level 5

Hi Linus,

In the  Reference guide on LASR server it is mentioned that "Compression exchanges less memory use for more CPU use. It slows down
any request that processes the data
". Having said that you are right in saying "uncompression ratio" during execution slows down performance, since it uncompressed every record while execution and with the compression ratio at hand its worth looking closely at the data modelling to reduce the size and hence improve the performance. Thanks a lot for your input.

SAS(R) LASR(TM) Analytic Server 2.4: Reference Guide

MichelleHomes
Meteorite | Level 14

Hi Nikhil,

You might also find this discussion thread useful

Kind Regards,

Michelle

//Contact me to learn how Metacoda software can help keep your SAS platform secure - https://www.metacoda.com

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Tips for filtering data sources in SAS Visual Analytics

See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 3528 views
  • 1 like
  • 4 in conversation