SushilNayak
Obsidian | Level 7
Hi All,
While having breakfast today, I thought of this. I'm not sure if I have lost it yet 🙂, but I could not explain to myself the compilation and execution phases for the step below. Can someone help?

data x;
county='India';
country='Sri Lanka';
run;

The output dataset would have Country='Sri L'. I understand that if a LENGTH statement is not used before a new variable is first assigned, SAS takes the length from the first assignment of that variable.

What confuses me is that, according to the papers available online and what I originally knew, the length assignment happens during compilation (PDV creation, descriptor creation, etc.), and data is not read during compilation. So what does SAS assume as the length during compilation? I ask because if this DATA step assigned the country variable inside an IF condition, the first assignment in the code would still determine the length of the variable. So is it correct to say that when a new variable is created from hardcoded values (not assigned from an already created variable), compilation doesn't do anything and execution is the only phase that matters for these variables?
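
To make the IF case concrete, here is a minimal sketch. The flag variable is made up for illustration, and I am assuming there is no LENGTH, FORMAT or INPUT statement for country. The first assignment in code order never executes, yet it appears to be the one that fixes the length:

```sas
data x;
  flag = 0;                               /* hypothetical switch, for illustration only */
  if flag = 1 then country = 'India';     /* never executes, but is the first assignment in the code */
  else country = 'Sri Lanka';             /* executes; value is cut to the length already chosen */
  len = vlength(country);                 /* length as fixed when the step was compiled */
  put country= len=;                      /* expected: country=Sri L len=5 */
run;
```

If the length were decided during execution, I would expect len=9 here, since 'Sri Lanka' is the only value ever assigned.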

Thanks!
sbb
Lapis Lazuli | Level 10
Unfortunately, your code has two different variables, county and country, which somewhat invalidates your association to the post. Regardless....

The assignment statement, at compilation time, will determine a SAS variable's type and length, unless a LENGTH statement (and apparently a FORMAT statement) or an INPUT statement (with a declared or implicit INFORMAT specification) has already defined the variable.

For a numeric variable, the "last" LENGTH declaration associated with a variable will be used at the time an output SAS dataset/member is written, and not while the DATA step is executing (for precision considerations).
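
A rough sketch of the numeric case, assuming a hypothetical dataset name (nums) and a deliberately short length of 3 bytes. The idea is that the value should keep full precision while the step executes, and the shorter length should only take effect when the observation is written:

```sas
data nums;
  length n 3;                     /* short numeric length, chosen only for illustration */
  n = 123456789;                  /* larger than a 3-byte numeric can hold exactly */
  put 'During execution: ' n=;    /* value as held while the step runs */
run;

data _null_;
  set nums;                       /* read the stored value back */
  put 'After reading back: ' n=;  /* value as stored in the dataset */
run;
```

Comparing the two log lines should show whether the stored value lost precision relative to the value seen during execution.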

Scott Barry
SBBWorks, Inc.
DanielSantos
Barite | Level 11
As I see it, it depends. But everything is in fact done at compile time.

Most of the time, SAS will manage to determine the length of the variable at compile time from the first assignment (if no explicit allocation exists through LENGTH, INPUT or FORMAT), although in some cases the length may not be determined correctly (see below).
If it fails, it will simply choose the default allocation size: 200 characters for character variables and 8 bytes for numeric variables.

For example:

data _null_;
X1='XXX';
X2=substr('XXX',1,2);
X3=cats('XXX');
X4=repeat('X',int(ranuni(0)*10)+1);
L1=vlength(X1);
L2=vlength(X2);
L3=vlength(X3);
L4=vlength(X4);
put _all_;
run;

Returns:
X1=XXX X2=XX X3=XXX X4=XXXXXXXXXX L1=3 L2=3 L3=200 L4=200 _ERROR_=0 _N_=1

X1's size was correctly determined (3).
X2's size is somewhat incorrect: SUBSTR produces a 2-character result, but the length of the source expression (3) was chosen instead.
X3 and X4 were too complex, and therefore allocated with the default size (200).

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.
SushilNayak
Obsidian | Level 7
Thanks, Daniel and Scott! I think I have got it: during compilation, SAS does determine the variable length from the first assignment of the variable, even though the value is hardcoded text. (The hardcoded text is what made me think the text was being read to get the length, and therefore that the length was set during execution.) For example, in the code below the length of country is set to 5, even though the STOP statement prevents either assignment from executing.

data x;
stop;
country='India';
country='Sri Lanka';
run;
proc contents;run;

Daniel - For X3 and X4, it's not the complexity but the functions' default length allocation that causes 200 to show up in VLENGTH. For REPEAT and the CAT functions (CAT, CATX, CATS, CATT), the default length of the left-side variable is always 200 if we don't provide a LENGTH statement for it before its first assignment. x=repeat('a',20) would also give a length of 200.
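
A minimal sketch of that last point, with made-up variable names (x, y, lx, ly) and assuming nothing else in the step defines them. The only difference between x and y is the LENGTH statement ahead of the first assignment:

```sas
data _null_;
  length y $ 3;        /* y is declared before its first assignment */
  x = cats('XXX');     /* x has no prior length, so it takes the function default */
  y = cats('XXX');     /* y keeps the declared length */
  lx = vlength(x);
  ly = vlength(y);
  put lx= ly=;         /* expected: lx=200 ly=3 */
run;
```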

Thanks Again!!
DanielSantos
Barite | Level 11
Interesting discussion here.

But, I must disagree on your last statement, about the default allocation of 200 characters.

It has everything to do with the complexity of the function (if the assigned variable has not been explicitly allocated beforehand), or with the non-supervisor functions, or, more precisely, with SAS not being able to determine the correct size of the first assignment at compile time.

This matter has been discussed before on SAS-L; see the complete post here:

http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0703a&L=sas-l&D=0&O=D&P=24681

And thank you Sushil.
It's nice to see from time to time, some discussion other than the traditional "how do I do this" problem!

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
SushilNayak
Obsidian | Level 7
Cool post, Daniel. Honestly, the way I learnt functions was by practice, execution and Ronald P. Cody's book SAS Functions by Example. Through constant usage I have been able to remember which functions give the default length of 200 and which give the length of the input variable itself; weird lengths have not yet happened to me :). Thanks for the SAS-L post link. I never knew about the subgroups of functions (Library and Supervisor functions), which make this length assignment happen the way it does. Thanks a ton for the info 🙂

