
substr function does not give error info when meeting undeclared variable.

Contributor
Posts: 47

I have a table of data cleaning rules in an Excel sheet, with the variables table_name and statement, like the following:

table_name   statement
------------------------------------
a            if ..... then .....
b            if birth_date=. then birth_date=input(substr(id,3,8),yymmdd10.)

The Excel file will be modified by our customer: new data cleaning rules will be added to it, and existing ones changed. During the ETL process, all rules in the xls file are loaded, syntax-checked, and then applied by a SAS macro.

The question is: when a rule references a variable that does not exist in the data set, the substr function does not give an error message saying that the variable doesn't exist. Instead, a new variable is created. So the syntax-checking step above will miss some errors.

Thanks!

Super User
Posts: 7,076

Re: substr function does not give error info when meeting undeclared variable.

SAS will write notes to the log about uninitialized variables. You need to search for those notes when checking your SAS log for errors.

26   data _null_;
27     x=substr(y,3);
28   run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
      27:12
NOTE: Variable y is uninitialized.
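
If searching the log for that note is not convenient, one possible alternative is to have SAS report uninitialized variables as errors. The sketch below assumes a SAS release that supports the VARINITCHK= system option (SAS 9.4, as far as I know); the DATA step is the same one as in the log above.

```sas
/* Assumption: this SAS release supports the VARINITCHK= system option. */
options varinitchk=error;   /* report uninitialized variables as errors */

data _null_;
  x = substr(y, 3);         /* y is never assigned and is not read from any data set */
run;

options varinitchk=note;    /* restore the default behaviour (a NOTE in the log) */
```

With this setting the uninitialized-variable message for y should appear as an ERROR rather than a NOTE, so a syntax-checking step that only looks for errors has a chance of catching it.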

Super Contributor
Posts: 474

Re: substr function does not give error info when meeting undeclared variable.

Hi.

You have to understand that no variables are created inside the DATA step at run time.

The DATA step is processed in two distinct phases: compile time and run time.

It's at compile time that the layout(s) of the destination dataset(s) are defined, not at runtime.

The compiler scans the code and allocates every variable referenced in it in a memory area known as the Program Data Vector (PDV), which is written entirely or partially to the destination data set(s). In doing so it sometimes makes assumptions about the type and length of new variables that aren't what you expected, hence the use of the LENGTH statement to guide SAS to the right assumptions.

So when the code runs, everything is pretty well defined in terms of variables, lengths and types.

In your case, SAS allocated a new variable (defaulting to numeric, length 8) before the run-time phase, because it saw the reference to it at compile time. I hope the explanation is clear.
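
A minimal sketch of the behaviour described above. The data set name work.compile_demo and the variables z and code are made up for illustration only.

```sas
/* Hypothetical names, for illustration only. */
data work.compile_demo;
  if 0 then z = 1;      /* never executes, yet z is allocated in the PDV at compile time */
  length code $ 10;     /* tells SAS the type and length before the first assignment */
  code = 'AB';          /* without the LENGTH statement, code would be character, length 2 */
run;

/* List the variables that were defined at compile time. */
proc contents data=work.compile_demo;
run;
```

PROC CONTENTS should list z as a numeric variable of length 8, even though the assignment to it never ran, and code as a character variable of length 10.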

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
