BASE SAS Varying lengths of VARs when using CATX Character Manipulation

Contributor
Posts: 26

Hi,

I am studying for the Base Certification Exam and I have some questions about how SAS assigns a length to variables created with a character function such as CATX. Here is what I think I know about CATX:

 a)   the default length is 200 bytes

 b)   a LENGTH statement can set the length to anything (when used at the top of the DATA step)

 c)   the variable newly created by CATX will take on the combined byte length of all the concatenated values, such as 5 bytes !! 2 bytes !! 7 bytes = a new character variable of 14 bytes

My question then is: when do any of the character manipulation functions return a value of length 200 when separating and concatenating? You are always creating a new character value when using these, correct? So won't the result always be the sum of the lengths of the variables that make it up? I am really confused about the rules of character manipulation, and I would greatly appreciate it if anyone could shed some light on this or point me toward additional reading material.

 

 

Thanks for all your help and time -

 


Accepted Solutions
Solution
‎09-14-2017 01:43 PM
Super User
Posts: 8,279

Re: BASE SAS Varying lengths of VARs when using CATX Character Manipulation

The best way to know is to test!  Write some code and see what SAS does with it!

 

The key thing is that you should always define your variables instead of forcing SAS to guess what you want.

If SAS knows nothing about the variable other than that it is character, it will default to $8.

input str $;  /* no length defined anywhere, so str defaults to $8 */
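A quick way to see that default in action, as a minimal sketch (the dataset name demo_input and the sample value are made up for illustration):

```sas
/* Hypothetical example: list input with no LENGTH or informat width */
data demo_input;
  input str $;      /* str is created as $8 */
  datalines;
abcdefghijkl
;
run;

/* The variable listing shows str as Char with Len 8 */
proc contents data=demo_input;
run;
```

Because str is only 8 bytes wide, the 12-character sample value is stored truncated to its first 8 characters.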

But if it can see that you are creating the variable from other variables, then it can try to figure out how big it should be.

length x $10;
y = x;  /* y gets the same length as x: $10 */
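To confirm this without running PROC CONTENTS, one option is the VLENGTH function, which returns the defined length of a variable. This is only a sketch; the dataset name demo_copy and the helper variable len_y are made up for illustration:

```sas
/* Hypothetical example: y is first seen in an assignment from x */
data demo_copy;
  length x $10;
  x = 'abc';
  y = x;                  /* y is created with the length of x */
  len_y = vlength(y);     /* defined length, not the length of the value */
  put len_y=;             /* writes len_y=10 to the log */
run;
```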

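It can even figure out some cases that involve operators or functions.

length a b c $10;
x = a || b;            /* x is $20: the sum of the operand lengths */
y = substr(c, 1, 5);   /* y is $10: SUBSTR uses the length of its first argument */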
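Here is the same idea as a complete step you can run, again only a sketch with made-up names (demo_ops, len_x, len_y) and sample values:

```sas
/* Hypothetical example: lengths SAS derives for an operator and for SUBSTR */
data demo_ops;
  length a b c $10;
  a = 'one';
  b = 'two';
  c = 'abcdefghij';
  x = a || b;             /* 10 + 10 bytes */
  y = substr(c, 1, 5);    /* length of the first argument, not 5 */
  len_x = vlength(x);
  len_y = vlength(y);
  put len_x= len_y=;      /* writes len_x=20 len_y=10 to the log */
run;
```

Note that y holds a 5-character value but is still defined as $10, so the derived length is a safe upper bound rather than an exact fit.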

But in general, when it cannot figure out what the length of the function result will be for every possible input, it will use $200.

length a b c $50;
x = catx(',', a, b, c);  /* x is $200, even though the result can never exceed 152 bytes */
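If you want something other than $200, define the target variable before the assignment. A minimal sketch, assuming the three $50 inputs above (the names demo_catx, z, len_x and len_z are made up for illustration):

```sas
/* Hypothetical example: default versus declared length for a CATX result */
data demo_catx;
  length a b c $50;
  a = 'x';
  b = 'y';
  c = 'z';
  x = catx(',', a, b, c);   /* not declared, so x is created as $200 */
  length z $152;            /* 50 + 50 + 50 plus two delimiters */
  z = catx(',', a, b, c);   /* z keeps its declared length */
  len_x = vlength(x);
  len_z = vlength(z);
  put len_x= len_z=;        /* writes len_x=200 len_z=152 to the log */
run;
```

The LENGTH statement only has to appear before the first reference to z, but putting it at the top of the step is easier to read.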

Also watch out when using these functions in PROC SQL, as some of them have shorter limits on what they can return there (at least they used to).


All Replies
Contributor
Posts: 26

Re: BASE SAS Varying lengths of VARs when using CATX Character Manipulation

Do you think I am confusing CATX and CAT?

Contributor
Posts: 26

Re: BASE SAS Varying lengths of VARs when using CATX Character Manipulation

Thank you! Testing now

Contributor
Posts: 26

Re: BASE SAS Varying lengths of VARs when using CATX Character Manipulation

This was very helpful. Viewing the contents of these various scenarios really helped me understand this. Thank you for your help and time.

Super User
Posts: 13,941

Re: BASE SAS Varying lengths of VARs when using CATX Character Manipulation

Reading the documentation can actually help. For instance, from the online help for CATX:

 

Length of Returned Variable

In a DATA step, if the CATX function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. If the concatenation operator (||) returns a value to a variable that has not previously been assigned a length, then that variable is given a length that is the sum of the lengths of the values that are being concatenated.
 
So this paragraph shows why you may get different lengths with
y = catx(',', var1, var2);
and
y = var1||','||var2;
 
when y has not had a length assigned.
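A side-by-side check might look like the sketch below. The lengths $5 and $7, the sample values, and the names demo_compare, y_catx and y_pipe are assumptions made for illustration, since a single y cannot receive both results in one step:

```sas
/* Hypothetical example: same inputs, function versus operator */
data demo_compare;
  length var1 $5 var2 $7;
  var1 = 'abc';
  var2 = 'defg';
  y_catx = catx(',', var1, var2);   /* function result: $200 */
  y_pipe = var1 || ',' || var2;     /* operator result: 5 + 1 + 7 = $13 */
run;

/* The variable listing shows Len 200 for y_catx and Len 13 for y_pipe */
proc contents data=demo_compare;
run;
```

The values differ too: CATX strips leading and trailing blanks from each argument before joining, while || keeps the trailing blanks that pad var1 and var2 to their full lengths.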