CALL SYMPUTX and uninitialized variables

Occasional Contributor
Posts: 6

If I run this code:

 

data _null_;
 x=y;
 call symputx ('MacroVariable',NonExistentVariable);
 run;

%put &MacroVariable;

 

Then I get the uninitialized note to tell me that Y was uninitialized, but I do not get a similar note to tell me that NonExistentVariable is uninitialized. Surely I should?

 

It annoys me enough that SAS only returns a note about uninitialized variables (as opposed to a warning) but this does not even do that.

 

I've tried Googling this to see if anyone else has any thoughts on the matter but I didn't find anything obvious. This seems like a bug or at least, something that should be changed. But I thought I'd post here to see if anyone had any thoughts on the matter. I'm considering raising this with SAS. (I realise I could be alone here thinking this is a problem, but as Winston Smith said in 1984, being a minority of one doesn't make you mad. Or something like that.)


Accepted Solutions
Solution
3 weeks ago
PROC Star
Posts: 1,471

Re: CALL SYMPUTX and uninitialized variables

Since you're new here @John6 (welcome!) just wanted to point out that there is a community for SAS ballot ideas, i.e. proposals for changes to SAS behavior / new functionality.

 

https://communities.sas.com/t5/SASware-Ballot-Ideas/idb-p/sas_ideas

 

Since this is documented behavior that CALL routines don't generate the uninitialized note, it's not really a bug.  And SAS isn't likely to change the default behavior to make it a note (or warning).  But SAS has been adding system options over time, that make it possible to turn on more and more notes/warnings/errors to the log.

 

I could imagine proposing an option like CALLVARINITCHK=NONOTE|NOTE|WARN|ERROR, which would control whether a log message is written when a CALL routine initializes a variable (i.e. creates a variable that was not defined elsewhere in the DATA step).  Might not have much likelihood of getting implemented, but I'd vote for it. : )

 

 



All Replies
Super Contributor
Posts: 359

Re: CALL SYMPUTX and uninitialized variables

[Deleted answer] Sorry I misread your question. Ignore my previous answer.

Valued Guide
Posts: 629

Re: CALL SYMPUTX and uninitialized variables

You are right, there should be a note in the log about NonExistentVariable.

Imho any undeclared or un-initialised variable must result in an error message. And there should not be any option to turn this off. Unfortunately, SAS is different.

Occasional Contributor
Posts: 6

Re: CALL SYMPUTX and uninitialized variables

Posted in reply to andreas_lds
Thanks. I was going to say it should be an error as well, but I was thinking, this is my first post, do I really want to stick my neck out that far... ;-)
Super User
Posts: 9,840

Re: CALL SYMPUTX and uninitialized variables

You are mixing up two very different systems.  

Base SAS - this is the programming language and the one that is compiled.  At compile time the compiler checks your datastep and encounters a variable y which is not in the dataset, therefore it puts a note out to the log.

 

Macro SAS - this is an additional tool which is used to create text, nothing more.  It is a find replace system.  The output text is then (mostly, not always) fed into the Base SAS compiler above.  

 

So one system is not going to check the other, macro pre-processor will check and resolve the macro part of the code, the Base SAS compiler will check the Base SAS code.  Two completely different systems and very different in the use.  

 

In your code, the call symputx creates a macro variable - not a datastep one.  Two very different systems, and therefore each has to be looked at differently.  For example, in this run the variable might not exist, but in another it might, so the macro compiler which generates the text can process each.  With datastep, the logic gets compiled, and has to be absolute, hence the note.

Occasional Contributor
Posts: 6

Re: CALL SYMPUTX and uninitialized variables

I'm quite happy I understand everything. CALL SYMPUTX assigns a value to a macro variable during the datastep execution (as opposed to before compile time). But the value it is going to assign is in NonExistentVariable. Which is aptly named as it doesn't exist (as a variable in the PDV).

 

As such I feel there should be a note put out to the log, in the same way as there will be when Y (which doesn't exist either) is assigned to X.

 

I only came across this recently when I deliberately misnamed the variable in a CALL SYMPUTX statement to show off that it would generate the note. And was shocked to see it didn't! And it left me wondering how many typos I've left behind me in CALL SYMPUTX statements...

 

I used to use SYMPUT. At least this puts out the note about the numeric (missing) value being converted to a character value before being assigned. Which would have drawn my attention to the error in the past. (As an uninitialised variable is created as a numeric variable during the compilation phase but macro variables are always character variables.)
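
To make the difference concrete, here is a minimal sketch. The names Typo1, Typo2, FromSymput and FromSymputx are made up for illustration. Going by the behaviour described above, I would expect the SYMPUT call to draw the numeric-to-character conversion note while the SYMPUTX call stays silent:

```sas
data _null_;
  /* Typo1 and Typo2 are hypothetical misspelled names; the compiler
     creates both as numeric variables holding missing values */
  call symput ('FromSymput',  Typo1);  /* expected: conversion note in the log */
  call symputx('FromSymputx', Typo2);  /* expected: no note at all */
run;

/* Brackets make any leading blanks in the stored values visible */
%put FromSymput=[&FromSymput] FromSymputx=[&FromSymputx];
```

Neither call produces the uninitialized note, so the conversion note from SYMPUT is only an indirect hint that something is wrong.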

Super User
Posts: 9,840

Re: CALL SYMPUTX and uninitialized variables

Maybe I didn't explain well, let me add some code to what you presented:

data _null_;
 x=y;
 call symputx ('MacroVariable',NonExistentVariable);
 put _all_;
run;

%put &MacroVariable;

If you run this you will see that x, y, nonexistentvariable are all created in the PDV.

NOTE: Variable y is uninitialized.
x=. y=. NonExistentVariable=. _ERROR_=0 _N_=1

The difference here is that x and y are both being used in terms of the datastep compiler, so as x gets set to y, which is not initialised, the note is created.  NonExistentVariable on the other hand is not used in any datastep part, only in the macro part, and the macro part converts what is there, a missing, to character missing.  No assignment is done, merely the creation of a macro variable which is empty. 

Occasional Contributor
Posts: 6

Re: CALL SYMPUTX and uninitialized variables

Yes, I see what you mean. But I do feel that there should be a note (if not warning but that's a different subject). Typos are very common reasons for creating uninitialised variables.

If you assign a macro variable to another macro variable, eg

%let NewVar = &NonExistentVar;

Then you will get a warning in the log about the apparent symbolic reference not being recognised.

The lack of note in my example seems to be inconsistent.
Super User
Posts: 9,840

Re: CALL SYMPUTX and uninitialized variables

But in this instance:

%let NewVar = &NonExistentVar;

The macro system has to look up the macro variable NonExistentVar, which it cannot find, therefore there is a problem.

 

I can understand that it looks odd, but keep in mind the two different systems.  For instance, how would the macro processor know whether a variable in the Base SAS PDV was created by the compiler or came in as data that just happens to be blank?  To the macro system, these two cases:

data _null_;
  x=y;
  call symputx ('MacroVariable',NonExistentVariable);
run;

/* And */

data have;
  NonExistentVariable="";
run;

data _null_;
  x=y;
  call symputx ('MacroVariable',NonExistentVariable);
run;
  

look exactly the same.  How would it know that the former is creating the variable in the PDV at compile time, while the second has it as data?  It would have to throw a warning in one case but not in the other, which makes for quite a messy overlap.

Valued Guide
Posts: 629

Re: CALL SYMPUTX and uninitialized variables

@RW9: the problem has nothing to do with the macro-compiler or macro-processor.

 

In the starting post both variable y and NonExistentVariable are used in the datastep before they are initialised. And while the datastep compiler recognises the missing initialisation of y, it fails to recognise the missing initialisation of NonExistentVariable. This is the problem. And I would love to understand why the very same compiler sees no problem in NonExistentVariable not being initialised before it is first used (as a parameter of call symputx).

PROC Star
Posts: 1,471

Re: CALL SYMPUTX and uninitialized variables

I agree with @John6, and I don't think this can be blamed on the macro processor.

 

Consider:

 

options VARINITCHK=ERROR ;

data _null_ ;
  put _ALL_ ;
  call symputx("mvar",NonExistentVar) ;
run ;
NonExistentVar=. _ERROR_=0 _N_=1

 

 

The data step compiler saw the reference to NonExistentVar, and created the variable in the PDV, not the macro processor.   It's the data step compiler, not the macro processor, which is responsible for generating the UNINITIALIZED VARIABLE note.  Just like it would if NonExistentVar was used in an assignment statement or elsewhere. 

 

It's even more obvious if you STOP the execution of the step before the CALL SYMPUTX executes.  The compiler has still done its work:

 

data _null_ ;
  put _ALL_ ;
  stop ;
  call symputx("mvar",NonExistentVar) ;
run ;

NonExistentVar=. _ERROR_=0 _N_=1

 

Note @John6, the VARINITCHK system option lets you turn the uninitialized NOTE into an ERROR, which is great:

 

 

data _null_ ;
  y=NonExistentVar ;
run ;

ERROR: Variable NonExistentVar is uninitialized.
NOTE: The SAS System stopped processing this step because of errors.

data _null_ ;
  put NonExistentVar ;
run ;

ERROR: Variable NonExistentVar is uninitialized.
NOTE: The SAS System stopped processing this step because of errors.

 

The documentation for VARINITCHK doesn't read that clearly to me:

Here are some of the contexts where a variable might not be initialized:
  • the variable appears on the left side of an assignment operator or the SUM statement
  • the variable is a parameter to a CALL routine
  • the variable is contained in an array
  • the variable can be set by a SET, MERGE, MODIFY, or UPDATE statement
  • the variable is specified in an INPUT statement
  • the variable is initialized in a RETAIN statement

 

But basically, I think it's listing the cases where a variable does not need to be explicitly initialized.  I think that's saying that when a variable is used as an argument to a CALL routine, the variable does not need to be initialized (and the data step compiler will initialize it as a numeric variable, without throwing a note).  I think that's unfortunate, as I generally want more notes (or better yet errors) in my log.  I wrote a paper for SGF about the benefits of offensive programming: https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1793-2018.pdf
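
One possible workaround, sketched under the assumption that VARINITCHK=ERROR is in effect as in the logs above: route the value through an ordinary assignment first, so the reference gets checked like any other. The temporary name _value is made up for illustration.

```sas
options varinitchk=error;

data _null_;
  /* A plain assignment is checked by the compiler, so a typo on the
     right-hand side should be reported just like y=NonExistentVar above */
  _value = NonExistentVar;
  /* _value itself is assigned, so passing it to the CALL routine is safe */
  call symputx("mvar", _value);
run;
```

The cost is one extra statement per CALL SYMPUTX, and it only helps where you remember to write it that way.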

 

 

Super User
Posts: 8,279

Re: CALL SYMPUTX and uninitialized variables

It is because you are using a CALL routine.  CALL routines CAN modify their arguments.  For example: CALL MISSING() or CALL SORTN().  So the compiler has no way to know whether they DID modify the values, and so it removes those variables from the list of ones that are obviously uninitialized.

 

You will see similar behavior for variables that are included in ARRAY statement, but never assigned a value.
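
A minimal sketch of the ARRAY case, with made-up names (a1-a3, first). If the above is right, I would expect this step to run without any uninitialized note for a1-a3, even though none of them is ever assigned a value:

```sas
data _null_;
  /* a1-a3 are created by the ARRAY statement and never assigned */
  array a{3} a1-a3;
  /* reading an element is an ordinary reference on the right-hand side */
  first = a{1};
  put _all_;
run;
```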

 

You are perhaps looking for a run-time check of whether the variable was ever assigned a value, and that would be a lot harder to create.

Super User
Posts: 10,570

Re: CALL SYMPUTX and uninitialized variables

@Tom has hit it, IMO.

PROC Star
Posts: 1,471

Re: CALL SYMPUTX and uninitialized variables

Can you explain more, @Tom?  

 

The ARRAY statement is designed to allow the creation of variables.  So it makes sense to me that it wouldn't throw an uninitialized note (though I have sometimes wanted a switch that would allow me to turn off the ability of array statement to create variables, in favor of a note).

 

But CALL routines are not generally designed to create variables.  (CALL MISSING can create variables, which is a handy exception).  

 

The compiler certainly knows that CALL SYMPUTX referenced a variable that doesn't exist in the PDV.  Why shouldn't it throw the uninitialized note, rather than silently initialize a variable?  Similar for CALL SORTN.

Super User
Posts: 8,279

Re: CALL SYMPUTX and uninitialized variables

CALL routines are (in general) needed when the parameters are being modified.  Otherwise just make a normal function instead. It can either be needed as a way to return more than one value (look at CALL SCAN() vs SCAN()) or because that is just what the function does (look at  CALL MISSING() or CALL SORTN()).
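
A small sketch of both cases, with made-up variable names (x, y, z, pos, len). CALL SORTN rearranges the values of its arguments in place, and CALL SCAN hands back two results through its third and fourth arguments:

```sas
data _null_;
  x = 3; y = 1; z = 2;
  /* sorts the values across the arguments, ascending: x=1 y=2 z=3 */
  call sortn(x, y, z);
  /* pos and len are not defined anywhere else; the routine fills them
     with the position and length of the second word */
  call scan('alpha beta gamma', 2, pos, len);
  put x= y= z= pos= len=;
run;
```

Note that pos and len are first seen as CALL routine arguments, which is exactly the situation where the compiler cannot tell up front whether the routine will supply a value.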

 

CALL SYMPUTX() and the older CALL SYMPUT() version cannot actually modify their arguments, but I suspect that the SAS coder who created the logic for whether or not to include such variables in the list of uninitialized variables did not add that level of detail to the routine and just assumed that all variables referenced by CALL xxx() routines should not be considered uninitialized.
