SAS Datasets size

Super Contributor
Posts: 444

Hi,
My co-worker and I are running the same program on PC SAS, but remotely logging in to 2 different SAS servers. The datasets my PC SAS creates are twice the size of the ones hers creates. How do you explain this? Is there anything I can change in the settings of my PC SAS? Any ideas, please? Thanks.

Accepted Solutions
Solution
‎07-18-2017 02:55 PM
Super User
Posts: 19,768

Re: SAS Datasets size

Run a PROC CONTENTS and look for differences, e.g. does she have the COMPRESS option turned on by default?

Super Contributor
Posts: 444

Re: SAS Datasets size

By "default", do you mean some setting in PC SAS?
Super Contributor
Posts: 438

Re: SAS Datasets size

What Reeza is suggesting is that you both run this (substituting the name of your dataset for sashelp.class):

 

proc contents data=sashelp.class;
run;

Then check the output, which should look something like this:

 

[Image: compress.png, PROC CONTENTS output with the compression setting circled]

 

I've circled the bit to check. If hers says "Yes" and yours says "No", that's the answer. Be aware, though, that small compressed files can end up slightly larger than their uncompressed versions. You can change the default setting by putting this in your autoexec.sas file:

 

options compress=yes;
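
Before changing anything, it may help to see what each session currently has. A minimal sketch, assuming you each run it in the same session that creates the datasets (if your code runs on the server through a remote submit, it is the server session's value that matters):

```sas
/* Print the current value of the COMPRESS system option to the log */
proc options option=compress;
run;
```

If the two logs show different values, that alone could account for the size difference.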
Super User
Posts: 7,035

Re: SAS Datasets size

How did you check the size? What units was the size reported in? Is it possible that one is counting 512-byte blocks and the other is reporting kilobytes?

Super Contributor
Posts: 444

Re: SAS Datasets size

I browse the Unix folders through FTP. Hers shows 3.2 GB; mine, for example, 7.8 GB.
Super User
Posts: 7,035

Re: SAS Datasets size

Depending on how your disks are configured on Unix, some tools will report the physical space used instead of the logical space used. The du command on Linux now supports an --apparent-size option. On some Linux machines I am using now, the reported size can differ considerably depending on whether --apparent-size is used.

That said, it is probably more likely that something else differs in how SAS ran the code. It could be some minor difference in the code, or perhaps you are actually reading different source files, or something in the settings, like the COMPRESS= option or even the block size that SAS used to write the observations.

Super User
Posts: 7,757

Re: SAS Datasets size


Tal wrote:
I browse the Unix folders through FTP. Hers shows 3.2 GB; mine, for example, 7.8 GB.

You should run proc contents on both datasets and compare the outputs. On a UNIX system, proc contents will also include the physical file size in the host dependent section.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 444

Re: SAS Datasets size

Chris, we both use exactly the same program. If she had the global option compress=yes, then I'd have it too, but I will still run PROC CONTENTS.

Thanks, all, for the advice.
Super Contributor
Posts: 438

Re: SAS Datasets size

The point is that the OPTIONS statement doesn't have to be in the program. autoexec.sas (if it's present in the correct location) runs at the start of every session and can set options that stay in effect for the whole session unless they are changed.
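
One way to see whether a session picked up an autoexec file is to ask SAS for the value of the AUTOEXEC system option. This is only a sketch: what it prints depends on how each session was started, and an empty value suggests that no autoexec file was specified for that session.

```sas
/* Write the autoexec file recorded for this session (if any) to the log */
%put AUTOEXEC=%sysfunc(getoption(autoexec));
```

Running this on both PCs and comparing the two log lines should show whether the sessions start from the same autoexec file.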

Super Contributor
Posts: 444

Re: SAS Datasets size

But doesn't this autoexec.sas still need to be included in her program, which would mean it is also included in mine?
Super Contributor
Posts: 438

Re: SAS Datasets size

No, it's a separate file. You can find out more details about it at this link

Super User
Posts: 19,768

Re: SAS Datasets size

She could also have set a preference or option differently, either in the same session or previously. Did PROC CONTENTS show no difference between the datasets? You could also run PROC COMPARE to see what differences it reports between the two datasets. The OBS option is another one that can cause issues.
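
A rough sketch of the PROC COMPARE step. The names work.mine and work.hers are placeholders (built here from sashelp.class just so the example runs); in practice you would point BASE= and COMPARE= at the two real datasets.

```sas
/* Placeholder copies standing in for the two real datasets */
data work.mine;
    set sashelp.class;
run;

data work.hers;
    set sashelp.class;
run;

/* Report differences in variables, attributes, and values */
proc compare base=work.mine compare=work.hers listall;
run;
```

If PROC COMPARE finds the contents identical, the size difference is more likely down to how the files were written (for example compression) than to the data itself.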

 

I'm assuming you've already checked the obvious that you both have the same versions and that the datasets created have the same number of observations and variables.  

 

A common example is someone using EG and someone else using Base. EG defaults to VALIDVARNAME=ANY while Base defaults to VALIDVARNAME=V7. Thus the exact same code can give different results in EG and Base.

 


Tal wrote:
But doesn't this autoexec.sas still need to be included in her program, which would mean it is also included in mine?

NO.

 

Super Contributor
Posts: 444

Re: SAS Datasets size

So she ran PROC CONTENTS: she has compression set to CHAR and I have compression set to NO.

We also found the autoexec.sas file on one of the shared drives. Does that mean her PC SAS connects to that autoexec file and mine does not?

 

Or should I just use the system option compress=char?

 

Super User
Posts: 7,757

Re: SAS Datasets size

I would not use the system option. Datasets that have no (or very short) strings may even increase in size.

Use the compress=yes dataset option and check what the log reports. Keep the option only if a dataset is reduced significantly in size.

What constitutes "significantly" is up to you. After all, 5% of 20 GB is still 1 GB.
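
As a minimal sketch of the dataset option (the output name work.cars_compressed is a placeholder, and sashelp.cars only stands in for your own data):

```sas
/* COMPRESS= as a dataset option applies to this output dataset only */
data work.cars_compressed (compress=yes);
    set sashelp.cars;
run;
```

After the step runs, look in the log for the note about compression and use it to decide whether the option is worth keeping for that dataset.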

☑ This topic is solved.