iSAS
Quartz | Level 8

I ran PROC CPORT on my data set expecting the file to become smaller, but the opposite happened: the resulting file, in .xpt format, is bigger than the original. What is the reason for this, and is it a common issue?

12 REPLIES
SASKiwi
PROC Star

CPORT files are standardised so they can be easily exchanged between computers with different operating systems and / or different SAS versions. They are not optimised to reduce storage. If the SAS datasets you CPORT were originally compressed, then I'm not surprised they end up bigger.

CPORT files also have a fixed block size, which is another reason they can be bigger. From the SAS documentation:

Verifying That the Communications Software Has Not Changed File Attributes

Verify that your communications software does not change file attributes. Here are the required attributes with values:
  • Logical record length (LRECL): 80 or an integer that is a multiple of 80 (for example, 160, 240, 320)
  • Block size (BLKSIZE): 8000 bytes
  • Record format (RECFM): Fixed block
RW9
Diamond | Level 26

What are you actually trying to achieve here, in terms of this post and your other one on gzip? Transport files are just that: a method of transporting a file from one operating system to another. They are not compressed files. What do you need compressed files for? What is the overall process you are trying to implement?

Kurt_Bremser
Super User

The xpt format is used to transport data between operating systems and SAS versions; it is not meant for reducing storage.
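
For reference, the round trip under discussion can be sketched roughly as follows. The library paths, the fileref `tranfile`, the data set name `mydata`, and the file extension are hypothetical placeholders, not taken from the original poster's setup.

```sas
/* Hypothetical paths and names; adjust to your environment. */
libname src '/data/source';
libname tgt '/data/target';
filename tranfile '/data/transfer/mydata.cpt';

/* Write one data set to a transport file. */
proc cport data=src.mydata file=tranfile;
run;

/* Restore the transport file into the target library. */
proc cimport library=tgt infile=tranfile;
run;
```

The transport file exists only to carry the data across; nothing in this step is aimed at making the file smaller.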

If you need a radical reduction in used space, use gzip or similar.

Reasoning:

  • both methods require one step to make the file readable for SAS again (proc cimport, or gzip -d)
  • the compression achieved with gzip is much better than that of internal SAS means (compress=yes, compress=binary)
iSAS
Quartz | Level 8

Thanks to all who responded to my questions.

Our current situation is that we will soon be migrating from SAS 9.3 to 9.4, and we need to compress our data sets as much as we can because we are running out of space on our server. We looked for ways to compress the files and decided to combine PROC CPORT and gzip, since in a test on one data set the size went from 65GB to just 212MB. However, when we tested this on other data sets we noticed inconsistencies: after gzip -d and PROC CIMPORT, the size of the data set was not the same as before compressing. This concerned us because the restored data set might not be fully equal to the original. So for now we are using gzip alone, since the size of the data set is unchanged after unzipping.

SASKiwi
PROC Star

You can use PROC COMPARE to ensure that your migrated data is exactly the same as your original data. Create a file share to your 9.3 data from your new SAS server so you can easily compare tables. If you are changing operating systems that might make it a bit harder to compare. Please advise us what operating systems both of your SAS servers run on.
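
A minimal sketch of such a comparison, assuming the old data is reachable through a read-only libref. The librefs `old93` and `new94`, the paths, and the data set name `mydata` are hypothetical.

```sas
/* Hypothetical librefs for the 9.3 copy (via file share) and the 9.4 copy. */
libname old93 '/mnt/sas93/data' access=readonly;
libname new94 '/sas94/data';

/* LISTALL also reports variables and observations found in only one table. */
proc compare base=old93.mydata compare=new94.mydata listall;
run;

/* SYSINFO is 0 when PROC COMPARE finds no differences.
   Read it immediately after the step, before another step resets it. */
%put NOTE: PROC COMPARE return code = &sysinfo;
```

Checking `&sysinfo` is convenient when many tables have to be compared in a loop, because the result can be tested in code instead of reading each report.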

iSAS
Quartz | Level 8

We used PROC COMPARE after CIMPORT and it shows no difference from the original table. However, the file sizes differ. I think one reason is that the compress option was used when the original table was created. After CPORT and CIMPORT the file became larger, but PROC COMPARE reports no issues. So is it safe to conclude that the cimported data is exactly the same as the original data?

Kurt_Bremser
Super User

When the report from proc compare finds no differences, you're good. But that works on the logical (content) side of things. Physical file parameters are not checked, and later SAS versions tend to use a larger pagesize, which results in different file layouts and sizes.

If the original datasets were compressed, proc cport/cimport should honor that and compress the imported datasets.
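
To see the physical attributes side by side, one option is to run PROC CONTENTS on both copies. This is only a sketch; the librefs, paths, and data set name are hypothetical.

```sas
/* Hypothetical librefs for the original and the re-imported copy. */
libname old93 '/mnt/sas93/data' access=readonly;
libname new94 '/sas94/data';

proc contents data=old93.mydata;
run;

proc contents data=new94.mydata;
run;
```

In the output, compare the "Compressed" attribute and, under the engine/host dependent information, "Data Set Page Size" and "Number of Data Set Pages". Differences there explain a different file size even when PROC COMPARE reports identical content.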

Kurt_Bremser
Super User

I would strongly advise to do a direct jump to the current SAS version (9.4M5). Otherwise you'll have to do all the work again RSN. There are so many important additions in 9.4 that it makes the intermediate step to 9.3 just a waste of time.

We did 9.2 to 9.4 recently, and since both servers run AIX, we just had to copy the files as they are.

RW9
Diamond | Level 26

My thoughts exactly; there is no reason to migrate to anything other than 9.4 at this point. And yes, 32-bit vs 64-bit is the only real killer, but that should only affect catalogs (which are bad bad bad!).

Also, if your server is running out of space, consider upgrading it before migration.  You can buy thousands of TB now so cheap it is untrue.

I assume you have the whole thing planned out, md5 hash checks on base files versus moved, backed up original, proc compares, code testing etc?

Kurt_Bremser
Super User

And some more thoughts regarding space issues:

  • use the compress options on all datasets, and look at the log to see if it has a reasonably positive effect (3% is not worth it)
  • expanding disk storage is not such an issue, given today's price/TB, even with high-end devices (SSD over FC)
  • for the sake of manageability, you should reduce the number of datasets present for online processing to what's really needed. Once you have the need to balance long term keeping vs storage availability, implement a suitable archival system. With us, this is handled by TSM.
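
As a rough illustration of the first point, compression can be requested per data set or for the whole session. The library path and data set names below are hypothetical.

```sas
/* Hypothetical library and data set names. */
libname mylib '/data/mylib';

/* Data set option: compress a single output data set. */
data mylib.big_compressed (compress=yes);
    set mylib.big;
run;

/* System option: compress every data set created later in this session. */
options compress=binary;
```

After the DATA step, the log contains a note stating by what percentage compression changed the size of the data set; that is the figure to check before deciding whether the option is worth keeping.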

And finally, I advise to not migrate in place on a given server. Set up a new one with the current SAS version, copy/migrate your data (depending on involved operating system(s), bitness or encoding issues), get it up and running, and once you've verified it's OK, switch your users over and decommission the old one. It's the most painless method for you and your users. As long as you run production on only one server, SAS (in my experience) won't object to having two installations in parallel for a limited time.

iSAS
Quartz | Level 8

My mistake guys sorry. We're migrating from 9.3 to 9.4

