Ps8813
Fluorite | Level 6
Hi, as we know, LENGTH, FORMAT, and INFORMAT are compile-time statements in the DATA step.
In the code below:
Data a;
Input chr $ num;
Length chr $6;
Length num 6;
Format chr $3.;
Format chr $4.;
/* sample data line added so the step runs */
Datalines;
abc 1
;
Run;

It gives num a length of 6, and chr a length of 8 with format $4.
Why does it take the last format rather than the first one it encounters?
Why does it take the last length for the numeric variable but the first length for the character variable?

Look at the code below:
Data test;
Set a1 a2;
Run;

Above, for variables with the same name, the attributes are taken from the first data set, i.e. a1. But in the earlier example we saw that FORMAT takes the last value, so by that logic shouldn't it take the attributes from data set a2?
Accepted Solutions
Astounding
PROC Star

You're actually asking a question that is more advanced than it seems.  At a simplistic level, the answer is that those are the rules.  At a more advanced level, you're asking about some of the details that go on during DATA step compilation. 

 

As part of the compilation process, SAS has to set up the PDV (program data vector):  storage locations in memory to hold the current values of every variable.  Once the PDV defines a length for a variable, that length cannot change.  That means since the INPUT statement defines a length for CHR, that length cannot change.  In fact, the later LENGTH statement for CHR should give you a warning in the log to that effect.  However ...

 

Numeric variables always have a length of 8 in the PDV.  So a LENGTH statement for a numeric variable only affects what goes into the output data set, not the PDV.  LENGTH statements for numeric variables follow the opposite pattern ... last length assigned is the one used in the output data set.

 

Some of the results of these ideas:

 

  • To change the length of an EXISTING character variable, the LENGTH statement must come before any other statement that might define the variable's length, such as a SET statement. 
  • To change the length of an EXISTING numeric variable, the LENGTH statement can go anywhere in the DATA step. 
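
To make the two bullets concrete, here is a minimal sketch. The data sets HAVE and WANT and the variables NAME and AMT are made up for illustration; they are not from the original question.

```sas
/* Hypothetical input: NAME is character with length 10, AMT is numeric with length 8 */
data have;
   length name $10 amt 8;
   name = 'abc';
   amt  = 1;
run;

data want;
   length name $20;   /* placed before SET, so this defines NAME in the PDV */
   set have;
   length amt 4;      /* placed after SET: still honored for a numeric variable */
run;

/* Inspect the stored lengths of NAME and AMT */
proc contents data=want;
run;
```

If the LENGTH statement for NAME were moved below the SET statement, the length coming from HAVE would already be fixed in the PDV and the new length would not take effect.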

Hoping this helps, rather than confuses ... best of luck.

Ps8813
Fluorite | Level 6

Great, Astounding. I got the point regarding length.

 

I have one more question.

 

data one;
attrib z length =$20. informat=$7. ;
run;
data two;
input z : $15.;
set one;
datalines;
test
;
run;

 

If I check the attributes of variable z in data set two, it shows:

length 15

format $15.

informat $7.

 

Why is the informat $7.?

 

Thanks for giving clear picture regarding length 🙂

FreelanceReinh
Jade | Level 19

@Ps8813: When I run your code with SAS 9.4, variable Z is not assigned format $15. and I wouldn't know where this format should come from, because it has not been specified anywhere in your code.

 

As to informat $7.:

The informat specification :$15. in your INPUT statement does not permanently associate an informat with variable Z. This informat is only used to perform modified list input. So, the first permanent assignment of an informat to variable Z occurs when the header information from dataset ONE is processed by the compiler.

 

Please note that at this point there is already an entry for Z in the PDV from the INPUT statement which precedes the SET statement. Due to the informat specification in the INPUT statement, Z has been created as a character variable with length 15. Now, the compiler detects that in dataset ONE variable Z is contained as a character variable with length 20>15, but the length of Z in the PDV cannot be changed anymore. Therefore, the compiler issues a warning message:

WARNING: Multiple lengths were specified for the variable z by input data set(s). This can cause truncation of data.

(This warning would not occur if the length of Z was <=15 in dataset ONE.)

 

The other variable attributes, here it's only informat $7., are taken from dataset ONE, because they have not been specified yet. Similarly, a permanent format or a variable label would be taken from ONE, unless they had been specified otherwise prior to the SET statement. FORMAT, INFORMAT or LABEL statements after the SET statement could change the corresponding variable attributes again, but a LENGTH statement would not work.
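
As a rough sketch of that last point, the step below is the poster's code with an INFORMAT and a FORMAT statement added after the SET statement. The $15. values are chosen only for illustration.

```sas
data one;
   attrib z length=$20 informat=$7.;
run;

data two;
   input z : $15.;    /* creates Z as character, length 15, in the PDV */
   set one;           /* Z inherits informat $7. from ONE; length stays 15 */
   informat z $15.;   /* after SET: replaces the inherited $7. */
   format z $15.;     /* after SET: assigns a permanent format */
   datalines;
test
;
run;

/* Inspect length, format, and informat of Z */
proc contents data=two;
run;
```

The warning about multiple lengths should still appear in the log, because the length of Z was already fixed by the INPUT statement before ONE was read.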

Astounding
PROC Star

This sounds correct.  To pick out a couple of key points:

 

  • The INPUT statement does not assign any informats.  (Any informats that appear as part of the INPUT statement are instructions for executing the INPUT statement, not permanently-assigned informats.)
  • The INPUT statement can assign lengths to variables that do not yet have a length.  (It cannot change already-assigned lengths.)
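
A small sketch contrasting the two cases follows. The data set names NO_INFORMAT and WITH_INFORMAT are invented for this example.

```sas
/* Informat appears only in the INPUT statement: used for reading, not stored */
data no_informat;
   input z : $15.;
   datalines;
test
;
run;

/* INFORMAT statement: permanently associates $15. with Z.
   As the first reference to Z it also defines Z as character. */
data with_informat;
   informat z $15.;
   input z;
   datalines;
test
;
run;

/* Compare the Informat column for Z in the two outputs */
proc contents data=no_informat;
run;

proc contents data=with_informat;
run;
```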
