Data abc;
set abc2 (Keep= firstname lastname patient_stat);
format Product $10.;
product = "ddd";
format PATIENT_STAT $25.;
format sex2 $4.;
if PATIENT_STAT = 'L' then PATIENT_STAT2 = "LBC";
if PATIENT_STAT = 'U' then PATIENT_STAT2 = "UBC";
rename = (patient_stat = Patient_stat2);
drop patient_stat2;
run;
Hello team,
I have the code above. The log doesn't show any errors. The program runs and produces results. The problem is that RENAME appears in the output as a variable: a numeric variable whose value is 0 in every row.
How can I fix this?
Regards,
Blue Blue
This
rename = (patient_stat = Patient_stat2);
Is an ASSIGNMENT statement, not a RENAME statement. It is creating a binary variable that will be 1 when the other two variables are equal and zero otherwise.
This is a RENAME statement.
rename patient_stat = Patient_stat2;
But it is probably NOT the rename statement you want since your code is creating the PATIENT_STAT2 variable. If your plan is to just throw away that new variable then remove the two IF statements that make it and the DROP statement.
Another problem with your code is there is no need to attach the $ format to character variables. SAS already knows how to print character variables so it needs no special formatting instructions for those. If you want to set the LENGTH of your character variables then do it explicitly instead of forcing SAS to GUESS what length you wanted them to have.
So this code will read in three variables and write out five. It will recode PATIENT_STAT and force its length to $25. It will add new variables PRODUCT and SEX2 with the constant values of 'ddd' and ' ', respectively.
data abc;
set abc2 (Keep= firstname lastname patient_stat);
length Product $10 PATIENT_STAT2 $25 sex2 $4 ;
product = "ddd";
if PATIENT_STAT = 'L' then PATIENT_STAT2 = "LBC";
if PATIENT_STAT = 'U' then PATIENT_STAT2 = "UBC";
rename patient_stat2 = Patient_stat;
drop patient_stat;
run;
Hello team,
This statement:
rename = (patient_stat = Patient_stat2)
Doesn't read from right to left?
Don't we need format statement when we add a variable to our dataset? Is length statement by itself enough?
How do we add a variable to our dataset outside of assignment statement (We don't want to use that variable in an assignment statement in our code)?
Thanks for your response.
Blue blue
In general, statements are read LEFT to RIGHT, not RIGHT to LEFT. Parentheses can override that, but in this case they do not change the meaning of the statement. It is the first = that makes it an assignment statement. RENAME in that case is the name of the variable that is being assigned the value. The extra parentheses do not change the way the expression on the right of the assignment is evaluated, since it has only the one operator.
You are probably confusing the RENAME statement with the RENAME= dataset option. The syntax you posted would be closer to that needed for the RENAME= dataset option.
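To make the distinction concrete, here is a minimal sketch that puts the three forms side by side. The dataset names (demo_in, demo_assign, demo_stmt, demo_opt) and the one-row input are made up for illustration and are not from the original program.

```sas
/* Hypothetical one-row input with a character variable PATIENT_STAT */
data demo_in;
  patient_stat = 'L';
run;

/* 1. Assignment statement: creates a NUMERIC variable named RENAME.  */
/*    Its value is 1 when the two variables are equal, 0 otherwise.   */
data demo_assign;
  set demo_in;
  patient_stat2 = 'L';
  rename = (patient_stat = patient_stat2);
run;

/* 2. RENAME statement: old-name = new-name, no leading = and no      */
/*    parentheses. The new name takes effect in the output dataset.   */
data demo_stmt;
  set demo_in;
  rename patient_stat = patient_stat2;
run;

/* 3. RENAME= dataset option: attached to a dataset name, with the    */
/*    old=new pairs inside parentheses. Inside this step the variable */
/*    is already called PATIENT_STAT2.                                */
data demo_opt;
  set demo_in (rename=(patient_stat = patient_stat2));
run;
```

Running PROC CONTENTS on each output dataset should show which variables each form produces.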
The format statement is used to attach formats to variables, hence its name. There is no need to use a FORMAT statement if you don't need to attach a format to the variable. There is no need to attach a format to most variables. The main exception would be DATE, TIME and DATETIME values as just printing the raw number of days or number of seconds they contain would be hard for humans to understand.
To define a new variable use a LENGTH statement (or the LENGTH= option of the ATTRIB statement). For numeric variables you can use a value from 3 to 8 (or on IBM mainframes also 2) but in general you should always use 8 so that all 64 bits of the 8 byte floating point values SAS uses for numbers are stored. For character variables prefix the number of bytes you want the variable to be able to hold with a $. Character variables can have a length from 1 to 32,767 bytes.
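As a rough sketch of those two ways of defining variables, the step below uses SASHELP.CLASS as input. The variable names PRODUCT, SEX2 and VISIT_COUNT and the label text are placeholders chosen for illustration.

```sas
data demo_define;
  set sashelp.class;
  length product $10;                         /* character, 10 bytes            */
  attrib sex2 length=$4 label='Recoded sex';  /* same idea via ATTRIB           */
  length visit_count 8;                       /* numeric, full 8 bytes          */
  product = 'ddd';                            /* SEX2 and VISIT_COUNT stay missing */
run;
```

Because SEX2 and VISIT_COUNT are never assigned a value, the log will carry a note that they are uninitialized; they are still added to the output dataset.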
To define other characteristics of the variable you can also optionally add FORMAT, INFORMAT and LABEL statements (or the corresponding options of the ATTRIB statement).
If you do not define a new variable before first using it then SAS will be forced to GUESS how to define it based on how you first use it. It will default to numeric unless there is some indication in the usage that it should be character. It will default the length to 8 bytes for both numeric and character variables. If there is some indication about what length to use for the previously unknown character variable then it will use that length.
That is why there is a side effect of attaching a format to a variable that has not yet been defined. Attaching $4. to SEX2 when it has not been defined will cause SAS to define it as character with a length of 4 bytes. But if SEX2 had already been defined with a length of 6 bytes, then attaching the $4. format to it just causes it to display only the first 4 of those 6 bytes. And if you try to set the length to $4 when it was already set to $6, SAS will generate a message that you cannot change the length of an existing character variable. (Note: You can change the length of an existing numeric variable because the length only has an impact when the data is written to the dataset. While the data step is running, all numeric values use the full 8 bytes.)
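The two cases can be compared with a small sketch like the following. The dataset names are hypothetical and the value 'Female' is only there to make any truncation visible.

```sas
/* Case 1: FORMAT is the first reference to SEX2, so it also defines it */
data demo_fmt_first;
  format sex2 $4.;    /* SEX2 becomes character with length 4           */
  sex2 = 'Female';    /* only the first 4 bytes can be stored           */
run;

/* Case 2: LENGTH defines SEX2 first, FORMAT only affects the display   */
data demo_len_first;
  length sex2 $6;     /* explicit storage length of 6 bytes             */
  format sex2 $4.;    /* printed with a width of 4                      */
  sex2 = 'Female';    /* all 6 bytes are stored                         */
run;

/* Compare the Len and Format columns for SEX2 in each dataset */
proc contents data=demo_fmt_first; run;
proc contents data=demo_len_first; run;
```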
Hello,
To define a new variable use a LENGTH statement (or the LENGTH= option of the ATTRIB statement).
[blueblue]: Are you saying we don't need to use format statement?
To define other characteristics of the variable you can also optionally add FORMAT, INFORMAT and LABEL statements (or the corresponding options of the ATTRIB statement).
[blueblue]: Are you saying we use format, informat and label statements only when we need to format our character/ date/ datetime field?
Then what is the syntax for adding a new variable to a dataset in a data step?
Define length only?
or
Define length and format only?
I appreciate your response.
Thanks,
Blue Blue
You only need to use a FORMAT statement if you want to attach a format, or to remove a previously attached format. Most variables do not need to have formats attached to them, as SAS does a good job of displaying both character strings and numbers without being given special instructions.
Anything that references a previously unknown variable in a data step will force SAS to add the variable. There are some exceptions, like RETAIN, that will set the name and position (order) of the variable without forcing a decision about the type and length.
If that first reference is not a LENGTH statement (which explicitly sets the type and length of the variable), then SAS will guess from the usage what type and length to use to define the variable. Some of those methods are precise and others not so much.
Attaching a numeric format will let SAS know to define it as numeric.
Attaching a character format will let SAS know to define it as character. With simple built-in formats like $ or $CHAR, SAS will do a good job of guessing the length, since the number of displayed bytes will match the number of stored bytes needed. But for user-defined formats, the display width of the format might be the right length to use for storage of the variable or it might not. The proper length for the storage of the variable is based on the values stored in the variable, not the values that are displayed.
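One way to keep storage and display separate when a user-defined format is involved is sketched below. The format name $PSTAT and its labels are invented for this illustration; the point is only that the stored codes are 1 byte long while the displayed labels are much wider.

```sas
proc format;
  /* Hypothetical labels: display text is wider than the stored code */
  value $pstat 'L' = 'Lower body check'
               'U' = 'Upper body check';
run;

data demo_userfmt;
  length code $1;        /* storage sized for the stored values 'L' / 'U' */
  format code $pstat.;   /* display uses the longer labels                */
  code = 'L';
  put code=;             /* writes the formatted value to the log         */
run;
```

Setting the length first means the FORMAT statement cannot influence how many bytes are stored.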
@GN0001 wrote:
Then what is the syntax for adding a new variable to a dataset in a data step?
Define length only?
or
Define length and format only?
Typically you would use an assignment statement, e.g.:
data want ;
set sashelp.class ;
age2=age**2 ;
NameUpcase=upcase(name) ;
run ;
When you use an assignment statement to create a new variable, SAS has rules for determining the type (numeric or character) and length of the variable.
Sometimes those rules result in a variable with a length that is different than you would like. This often happens with character variables when you assign them a literal value for example if you code:
data want ;
set sashelp.class ;
if Sex='M' then Gender='Male' ;
else gender='Female' ;
put gender= ;
run ;
SAS will determine that GENDER will be a character variable of length 4, because the first value the compiler sees for GENDER is a character string of length 4. Because you probably want a longer string to accommodate the six characters in 'Female' you can use the LENGTH statement to tell the compiler to create the GENDER variable as a character variable with length 6:
data want ;
set sashelp.class ;
length Gender $6 ;
if Sex='M' then Gender='Male' ;
else gender='Female' ;
put gender= ;
run ;
When working with numeric variables, it is rarely useful to use a length statement because doing so will reduce the precision of the numeric value (while saving storage space). Most often the default length of 8 for a numeric variable is acceptable. But for character variables, it is more common to explicitly set the length of a character variable to avoid the problem of SAS determining a length that is shorter or longer than you desire.
Can you describe your goal for the output dataset?
What variables do you want it to have?
Your code now is not using the RENAME statement. It's using an assignment statement to create a variable named RENAME.
I can't quite tell what you're trying to do. Usually code is clearer if you use the rename OPTION, like you are using the KEEP option.
Hello,
Thanks for your input.
Thanks,
Blue Blue
@GN0001 wrote:
How can I fix this?
Regards,
Blue Blue
Use your best source of knowledge, the SAS documentation:
Hi,
Here are two other suggestions based on Tom's code:
data abc (drop=_patient_stat);
set abc2 (Keep= firstname lastname patient_stat
rename=(patient_stat=_patient_stat));
length Product $10 PATIENT_STAT $25 sex2 $4;
product = 'ddd';
sex2 = ' ';
if _patient_stat = 'L' then PATIENT_STAT = 'LBC';
else if _patient_stat = 'U' then PATIENT_STAT = 'UBC';
run;
Suggestion 2
proc format;
value $pstat 'L'='LBC' 'U'='UBC';
run;
data abc (drop=_patient_stat);
set abc2 (keep= firstname lastname patient_stat
rename=(patient_stat=_patient_stat));
length Product $10 PATIENT_STAT $25 sex2 $4;
product = 'ddd';
sex2 = ' ';
PATIENT_STAT = put(_patient_stat,$pstat25.);
run;