Kastchei
Pyrite | Level 9

Hello,

 

Does anyone know the length needed for a numeric value that is one of the special missing values, .A - .Z?  Integers between -8192 and 8192 can be stored in a numeric variable that is only length 3.  Special missing values appear to work fine with length = 3, too, but I want to get a definitive answer.

 

Warm regards,

Michael

Accepted Solution

Tom
Super User

They work fine. Try it yourself.

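data test;
  /* one numeric variable for each allowed storage length, 3 through 8 bytes */
  length x3 3 x4 4 x5 5 x6 6 x7 7 x8 8;
  array x x3-x8;
  /* ordinary missing plus three special missing values */
  do y=.,.z,.a,._ ;
    do over x;
      x=y;
    end;
    output;
  end;
run;
/* read the values back and print them to the log */
data _null_;
  set test;
  put (_numeric_) (=);
run;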
x3=. x4=. x5=. x6=. x7=. x8=. y=.
x3=Z x4=Z x5=Z x6=Z x7=Z x8=Z y=Z
x3=A x4=A x5=A x6=A x7=A x8=A y=A
x3=_ x4=_ x5=_ x6=_ x7=_ x8=_ y=_
NOTE: There were 4 observations read from the data set WORK.TEST.


10 Replies
LaurieF
Barite | Level 11

3 is the minimum, unless you're on an IBM mainframe, where I think it's 2!

heffo
Pyrite | Level 9

I think that @Kastchei knows that SAS can save the special missing values in 3 bytes. As I read it, he wants a formal answer on how many bytes are needed, or on how the data is stored.

I haven't seen any documentation about exactly how they are stored.

SAS has documentation on how many bytes are needed to store an integer, Maximum Integer Size, but it does not say how special missing values are stored. The same page mentions the special missing values, but not their storage.
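
The integer limits on that page are easy to confirm empirically. Here is a minimal sketch, assuming a Windows or UNIX host where numerics are IEEE doubles; the data set name INTS and the variables N3, N8, and MATCH are made up for illustration.

```sas
/* Store integers around 8192 in a length-3 variable (N3)
   next to a full 8-byte copy (N8). */
data ints;
  length n3 3 n8 8;
  do n8 = 8191, 8192, 8193;
    n3 = n8;   /* still exact in the PDV; truncation happens on output */
    output;
  end;
run;

/* Read the values back and flag any that did not survive storage. */
data _null_;
  set ints;
  match = (n3 = n8);
  put n8= n3= match=;
run;
```

On such hosts, 8193 should come back changed, since 8192 is the largest integer that length 3 represents exactly. On an IBM mainframe the limits differ.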

So, I think there might not be an official answer. Instead we just have to hope that it works with 3 bytes and that they don't change anything in the future. 🙂 

Reeza
Super User
In SAS, 3 bytes is the minimum length for a numeric variable, so the minimum length needed is 3 bytes.

https://support.sas.com/documentation/cdl/en/hostwin/69955/HTML/default/viewer.htm#n04ccixfia6l2pn1f...
LaurieF
Barite | Level 11

I was asserting that it wasn't anything to worry about. From memory, they're stored in the length assigned, but only take up three bytes. Any length longer will just pad the extra with '00'x. That's in Windows - other operating systems will do it differently - big- and little-endian, and that sort of thing.
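
One way to look at the stored bytes yourself is sketched below: write the missing values at lengths 3 and 8, read them back, and print both with the HEX16. format. The data set name LENS and the variable SAME are illustrative only, and the byte patterns are host dependent, so no output is shown here.

```sas
/* Write ordinary and special missing values at length 3 and length 8. */
data lens;
  length x3 3 x8 8;
  do y = ., ._, .a, .z;
    x3 = y;
    x8 = y;
    output;
  end;
run;

/* Read them back; SAS expands every numeric to 8 bytes in the PDV.
   SAME is 1 when the length-3 copy has the same bytes as the length-8 copy. */
data _null_;
  set lens;
  same = (put(x3, hex16.) = put(x8, hex16.));
  put y= x3=hex16. x8=hex16. same=;
run;
```

If SAME is 1 on every row, nothing was lost by storing the value in 3 bytes on that host.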

 

SAS wouldn't dare change it, because all the code in the world would break!

SASKiwi
PROC Star

If your objective is to conserve disk space, I suspect you will get better mileage by compressing your SAS data. Setting a SAS default of COMPRESS = YES or COMPRESS = BINARY in all SAS sessions is now quite a common practice. Then you don't need to worry about optimising variable lengths at all.
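
For anyone who has not used it, this is roughly what the two ways of turning compression on look like. It is only a sketch: WORK.DEMO and its variables are hypothetical, and the right setting depends on your data.

```sas
/* Session-wide default: compress every data set created from here on. */
options compress=yes;

/* Per-data-set override using the COMPRESS= data set option. */
data work.demo (compress=binary);
  do id = 1 to 100000;
    amount = mod(id, 7);
    output;
  end;
run;
```

When a compressed data set is created, SAS writes a NOTE to the log saying by what percentage the size decreased or increased.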

Kastchei
Pyrite | Level 9

Yeah, most of the time I don't worry about it and just use compression.  But shortening the lengths saves more than compression does (sometimes both help), so on really big datasets, I try to use both.

SASKiwi
PROC Star

Good to hear you are using compression. My personal approach is that it is cheaper and quicker to not bother with changing numeric variable lengths as disk space is plentiful and nowadays relatively cheap.

Reeza
Super User
It's also slightly dangerous, IMO, because the last thing you'll check will be your DB settings. If your data changes for some reason and no longer fits into that space, it doesn't always generate an error (it depends on your system), so you don't catch truncation or data issues until later, and these types of issues are a pain to debug. Defensive programming would mean leaving some extra space there. My hours of time saved are worth more than a little extra disk space.
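
If you do shorten lengths, one possible defensive check is to test the values with the TRUNC function before storing them. This is a sketch, not a complete validation step: the test values are arbitrary, and missing values are skipped on purpose because the thread above already covers them.

```sas
data _null_;
  do x = 8192, 8193, 0.1;
    /* TRUNC(number, length) truncates to LENGTH bytes, then pads back to 8 */
    if not missing(x) and trunc(x, 3) ne x then
      put "WARNING: " x= "would change if stored with length 3";
  end;
run;
```

Running the same comparison over real data before applying a LENGTH statement surfaces the silent truncation described above while it is still cheap to fix.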
LaurieF
Barite | Level 11

I generally concur, but not completely. And I say this as a big fan of compression.

 

If the PDV is predominantly numeric (especially with few missing values), or otherwise contains short, fully populated character variables, compression may well increase the size of the dataset. The extra I/Os required to read the larger-than-necessary dataset, plus the extra cost of decompressing, can thus extend the run-time considerably.

 

My rule of thumb is: if compression is less than 30%, don't bother.
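
To apply a rule of thumb like this you need the percentage. SAS reports it in the log NOTE when the data set is created, and, as far as I know, DICTIONARY.TABLES also exposes it. A sketch, assuming SAS 9.4 column names (PCOMPRESS, FILESIZE); WORK.DEMO_C is a made-up data set:

```sas
/* Build a compressed data set whose long, mostly blank character
   variable should compress well. */
data work.demo_c (compress=yes);
  length name $40;
  do id = 1 to 100000;
    name = 'mostly blank';
    output;
  end;
run;

/* Look up the compression setting and percentage for that data set. */
proc sql;
  select memname, compress, pcompress, filesize
    from dictionary.tables
    where libname = 'WORK' and memname = 'DEMO_C';
quit;
```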
