Hello everyone,
I'm relatively new to SAS, having previously worked extensively with R. Currently, I'm using SAS during my co-op placement. While reviewing the code written by the previous co-op student, I came across something that confused me.
The SAS code contains a statement like this:
"if cough =1 or wheeze=1 or ... =1, then symp=1".
However, I couldn't find any columns named cough, wheeze, or any of the others mentioned in the SAS dataset that was read.
In R, code like this only runs if the data actually has columns with those names. Could anyone help me with this?
Maxim 2: Read the Log.
If a variable is used which is not in an incoming dataset, or assigned a value somewhere in the code, an "uninitialized" NOTE is written to the log.
Hi Kurt,
A variable is used that is not in the incoming dataset, but the code still runs. There is no error in the SAS log.
Thanks.
Chloe
Sorry, I mean that the variables used in the code are not in the dataset I read in. At least, no column name matches.
Hi Reeza,
For confidentiality, I changed some of the variable names, but the relevant part of the code looks like this:
libname data1 "xxxx\xxxxx\xxxxx\"; /*Path*/
data data1;
set data1.xxxxxxxxx;
run;
data data2;
set data1;
if cough =1 or phlegm =1 or wheeze =1 then coughsymp =1; else coughsymp =0;
if xxxx =1 or xxxx= 1 or xxxx =1 then zzzz =1; else zzzz=0;
run;
But after running the code, there is no error in the log, and there is no such column in the dataset that was read in, which is why I am confused.
Did you run a proc contents on your input data set?
proc contents data=data1.xxxxx;
run;
EDIT: From this, look at the variable name and label. It's possible the variable has a label so you're not seeing the variable name but the label instead.
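As a complement to PROC CONTENTS, here is a minimal sketch that searches both variable names and labels across the whole library. It assumes the DATA1 libref from the posted code is already assigned and that the symptom words to search for are cough, wheeze, and phlegm; the dataset name is left unfiltered because it was masked in the post.

```sas
/* Search every dataset in the DATA1 library for variables whose NAME or
   LABEL mentions one of the symptom words. DICTIONARY.COLUMNS stores the
   libref in uppercase, so the comparison uses 'DATA1'.                  */
proc sql;
  select memname, name, type, label
    from dictionary.columns
    where libname = 'DATA1'
      and (   upcase(name)  contains 'COUGH'
           or upcase(name)  contains 'WHEEZE'
           or upcase(name)  contains 'PHLEGM'
           or upcase(label) contains 'COUGH'
           or upcase(label) contains 'WHEEZE'
           or upcase(label) contains 'PHLEGM');
quit;
```

If a row comes back with the word only in the LABEL column, the viewer was probably showing labels rather than names.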
You won't get an ERROR message, since it is not an error (other than a logic error).
But if the variables really do not exist, then you should get notes.
Example:
271  data have;
272    id=1;
273  run;

NOTE: The data set WORK.HAVE has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

274
275  data want;
276    set have;
277    if cough =1 or wheeze=1 then symp=1;
278  run;

NOTE: Variable cough is uninitialized.
NOTE: Variable wheeze is uninitialized.
NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
But it is possible to write a step where there is no obvious place where COUGH is assigned a value, yet the uninitialized note does not appear. The most obvious way is to include the variable in an array.
In this example COUGH is included in the array, so it no longer generates the note.
286  data want;
287    set have;
288    array x cough ;
289    if cough =1 or wheeze=1 then symp=1;
290  run;

NOTE: Variable wheeze is uninitialized.
NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
@ashiah wrote:
Hi Kurt,
A variable is used that is not in the incoming dataset, but the code still runs. There is no error in the SAS log.
Thanks.
Chloe
If you do not get the "uninitialized" NOTE, then the variable is either read from a dataset (or an external file with an INPUT statement) or is assigned a value somewhere in the DATA step.
@ashiah wrote:
Hi Kurt,
A variable is used that is not in the incoming dataset, but the code still runs. There is no error in the SAS log.
Thanks.
Chloe
That is because it is NOT a syntax error. SAS is designed to allow you to reference variables that you have not yet defined. In that case the data step compiler will create the variable and make its best guess at whether you wanted a numeric or a character variable and if character what length it should have.
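A small sketch of that behavior, using made-up names (DEMO, ID, FLAG) that are not from the original code: one variable is only referenced in a numeric comparison and another is first seen in a character assignment, and PROC CONTENTS then shows what the compiler decided.

```sas
data demo;
  id = 1;
  /* COUGH is never assigned or read in; the compiler creates it as a
     numeric variable with a missing value and writes an "uninitialized"
     NOTE to the log.                                                   */
  if cough = 1 then symp = 1;
  /* FLAG is first seen in an assignment from a one-character literal,
     so it is created as a character variable of length 1.              */
  flag = 'Y';
run;

/* List the variables, their types and lengths as SAS defined them. */
proc contents data=demo;
run;
```

The output dataset contains COUGH even though no statement ever gave it a value, which is why the step runs without an ERROR.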
For a more complete answer post the complete log for the data step in question.
SAS as a language goes out of its way to keep code from failing on what are, hopefully, minor errors.
One way it does this is by creating variables when the code mentions them, if they have not been defined previously, for example by appearing in a source data set or in statements that are intended to create variables, such as LENGTH, ARRAY, INFORMAT, FORMAT, and LABEL.
Even misspelling a variable name will create a variable with the new spelling, just in case you meant to and simply haven't provided the rest of the code yet. SAS generally does not throw an error but uses the new variable with a missing value. {Implied hint: long, complex names are easier to misspell}
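To illustrate the misspelling case, here is a minimal sketch with invented data (the HAVE/WANT datasets and the values are assumptions for demonstration only). WHEZE is a deliberate misspelling of WHEEZE.

```sas
data have;
  cough  = 0;
  wheeze = 1;
run;

data want;
  set have;
  /* WHEZE is a typo for WHEEZE. SAS creates a new numeric variable
     WHEZE with a missing value instead of stopping with an error.  */
  if cough = 1 or wheze = 1 then symp = 1;
  else symp = 0;
run;

/* SYMP is 0 although WHEEZE is 1, and WHEZE appears as an extra,
   always-missing column.                                           */
proc print data=want;
run;
```

The only hint that something went wrong is the "Variable wheze is uninitialized" NOTE in the log, which is why reading the log matters even when there is no ERROR.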
Another concern with inherited code is tracing everything back to the point where the data is read into SAS.
A common occurrence with new SAS users is a reliance on PROC IMPORT, or on menus that basically invoke PROC IMPORT, to read data. New files read with old code that uses PROC IMPORT can end up with different variable names because the column headings in the source file changed. So a variable that was previously "Cough" might become "Heavy_Cough" or something similar, changing only the name and not the meaning of the variable.