Tal
Pyrite | Level 9
Hi,
My co-worker and I are running the same program on PC SAS, but we remotely log in to two different SAS servers. The datasets my PC SAS creates are twice the size of the ones hers creates. How do you explain this? Is there anything I can change in the settings of my PC SAS? Any ideas, please? Thanks
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Run PROC CONTENTS on both datasets and look for differences, e.g., does she have the COMPRESS option turned on by default?

15 REPLIES
Tal
Pyrite | Level 9
By default, do you mean some setting in PC SAS?
ChrisBrooks
Ammonite | Level 13

What Reeza is suggesting is that you both run this (substituting the name of your dataset for sashelp.class):

proc contents data=sashelp.class;
run;

Then check the output, which should look something like this:

[Image: compress.png]

I've circled the bit to check. If hers says "Yes" and yours says "No", that's the answer. You can change the default setting by putting this in your autoexec.sas file. You should be aware, though, that small compressed files can be slightly larger than their uncompressed versions.

options compress=yes;
Tom
Super User

How did you check the size? What units was the size reported in? Is it possible that one is counting 512-byte blocks and the other is reporting kilobytes?

Tal
Pyrite | Level 9
I go to the Unix folders through FTP. Hers shows 3.2GB; mine shows, for example, 7.8GB.
Tom
Super User

Depending on how your disks are configured on Unix, some tools will report the physical space used instead of the logical space used. The du command on Linux now supports an --apparent-size option. I have some Linux machines I am using now where the reported size can differ a lot depending on whether the --apparent-size option is used.

That said, it is probably more likely that something else is different in how SAS ran the code. It could be some minor difference in the code, or perhaps you are actually reading different source files, or something about the settings like the COMPRESS= option, or even the block size that SAS used to write the observations.

Kurt_Bremser
Super User

@Tal wrote:
I go to the Unix folders through FTP. Hers shows 3.2GB; mine shows, for example, 7.8GB.

You should run PROC CONTENTS on both datasets and compare the outputs. On a UNIX system, PROC CONTENTS will also include the physical file size in the host-dependent section.
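
For a side-by-side check, querying DICTIONARY.TABLES is one way to pull the relevant attributes into a single row per dataset. This is only a sketch: WORK.CLASS_COPY is a made-up example dataset, and columns such as FILESIZE may not be available in older SAS releases, so treat the column list as an assumption to verify on your own system.

```sas
/* Hypothetical example dataset; substitute your own libref and dataset. */
data work.class_copy;
  set sashelp.class;
run;

/* LIBNAME and MEMNAME values are stored in uppercase in DICTIONARY.TABLES. */
proc sql;
  select libname, memname, nobs, nvar, obslen,
         compress, bufsize, npage, filesize
    from dictionary.tables
    where libname = 'WORK' and memname = 'CLASS_COPY';
quit;
```

Running the same query on both servers and comparing the COMPRESS, BUFSIZE, and NPAGE values should narrow down where the size difference comes from.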

Tal
Pyrite | Level 9
Chris, we both use exactly the same program. If she had the global option compress=yes, then I'd have it too, but I will still run PROC CONTENTS.

Thanks, all, for the advice.
ChrisBrooks
Ammonite | Level 13

The point is that the OPTIONS statement doesn't have to be in the program. autoexec.sas (if it's present in the correct location) runs at the start of every session and can set options for the whole session unless they are changed later.
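
A quick way to see what a session is actually using is to ask SAS directly. The sketch below assumes nothing beyond Base SAS. If the datasets are written by code submitted to a remote server, the check presumably needs to run in that remote session, since that is where the options apply.

```sas
/* Show the COMPRESS system option in effect for this session.        */
/* VALUE adds detail on how the current value was set.                */
proc options option=compress value;
run;

/* Show which autoexec file, if any, this session was started with.   */
proc options option=autoexec;
run;

/* The same value can be written to the log from open code.           */
%put COMPRESS is set to: %sysfunc(getoption(compress));
```

If the two sessions report different COMPRESS values while running identical program code, the difference is coming from configuration (autoexec or config file) rather than from the program.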

Tal
Pyrite | Level 9
But still, this autoexec.sas needs to be included in her program, which would also be included in mine, no?
ChrisBrooks
Ammonite | Level 13

No, it's a separate file. You can find out more details about it at this link.

Reeza
Super User

She could also have set a preference or option differently, either in the same session or previously. Did PROC CONTENTS show no difference between the datasets? You could also run PROC COMPARE to see what differences it reports between the two datasets. The OBS option is another that can cause issues.
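
A minimal PROC COMPARE sketch, using two made-up WORK datasets built from sashelp.class so it runs anywhere; WORK.MINE and WORK.HERS are placeholder names standing in for the two real datasets. Note that PROC COMPARE reports differences in values and variable attributes, not in compression or file size, so it complements PROC CONTENTS rather than replacing it.

```sas
/* Placeholder datasets; replace with the two datasets being compared. */
data work.mine;
  set sashelp.class;
run;

data work.hers;
  set sashelp.class;
  /* Introduce one deliberate difference so the report has content. */
  if name = 'Alfred' then age = age + 1;
run;

/* Report differences in variables, attributes, and values. */
proc compare base=work.mine compare=work.hers;
run;

/* Check whether the OBS= system option is limiting rows read. */
proc options option=obs;
run;
```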

 

I'm assuming you've already checked the obvious: that you both have the same SAS version and that the datasets created have the same number of observations and variables.

A common example is someone using EG and someone else using Base SAS. EG defaults to VALIDVARNAME=ANY, while Base defaults to VALIDVARNAME=V7. Thus the exact same code in EG and Base can produce different results.

@Tal wrote:
But still, this autoexec.sas needs to be included in her program, which would also be included in mine, no?

NO.

 

Tal
Pyrite | Level 9

So she ran PROC CONTENTS, and her dataset shows compression set to CHAR while mine shows compression set to NO.

We also found the autoexec.sas file on one of the shared drives. Does that mean her PC SAS connects to that autoexec file and mine does not?

Or what if I just use the system option compress=char?

Kurt_Bremser
Super User

I would not use the system option. Datasets that have no (or very short) strings may even increase in size.

Use the compress=yes dataset option, and check what the log reports. Keep the option only if a dataset is reduced significantly in size.

What constitutes "significantly" is up to you. After all, 5% of 20 GB is still 1 GB.
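
As a rough sketch of that approach, the steps below write the same data with and without the dataset option so the log notes can be compared. The output dataset names are made up for illustration, and sashelp.cars is just a convenient sample table.

```sas
/* Baseline: explicitly uncompressed copy. */
data work.cars_plain(compress=no);
  set sashelp.cars;
run;

/* Same data with the COMPRESS= dataset option applied to the output. */
data work.cars_comp(compress=yes);
  set sashelp.cars;
run;
```

After the second step, the log should include a note saying whether compression decreased or increased the dataset size, and by what percentage. That note is what to base the keep-or-drop decision on.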
