Hi all—
I’m having trouble with the TRANWRD function and variable length. I have a large data set that I continually append data to. In the data set the variable AGENCY has a length of 5. I need to fix the value of ‘ST JO’ contained in the AGENCY variable and swap it out for ‘ST JOHN'. When I add in the LENGTH statement for AGENCY the variable length does change but the value stays as 'ST JO' and does not change/correct to 'ST JOHN'.
data want
length AGENCY $8;
set have
AGENCY =TRANWRD(AGENCY,'ST JO','ST JOHN') ;
run;
Thoughts?
Thanks!
Works fine for me. Are you sure that is what is in your data? Perhaps the space is not really a space?
data have;
input agency $5.;
cards;
ST JO
run;
data want ;
length AGENCY $8;
set have ;
AGENCY =TRANWRD(AGENCY,'ST JO','ST JOHN') ;
put agency=;
run;
Hi Tom--
Thanks for your help.
What is this last step do:
put agency=;
Thanks!
put agency=;
write the value of agency to log file.
Hi Tom and Linlin--
The issue still seems to happen. But it happens when I combine the data. I use a simple union ststment to do it:
proc sql;
create table want as
select *
from have
union
select *
from have;
quit;
It looks like the variable lengths always stay the same But the Format and Informat values change. How do I keep them at the same value everytime?
Thanks.
In general it is a BAD idea to attach a FORMAT to a character variable for just the reason you state. If you have a variable with length of 8 and it has a format of $5 attached to it then when you try to display the value it is truncated.
Unfortunately SAS has a nasty habit of always attaching formats to character variables when you pull the data from an external database system such as Oracle.
I find the easiest thing to due is to use a FORMAT statement in a data step to remove the formats.
data want;
set want;
format _character_ ;
run;
Tom Thanks! last question, I just want to make sure I'm understanding you.
does _character_ means it applies to all the character variables in the data set. What if I have numeric variables in the data set as well?
Right. There are many ways to specify lists of variables. _character_ , _numeric_ and _all_ are useful ones. You can also use ranges of variables with numeric suffix (VAR1 - VAR20) or based on the order in the dataset ( ID -- ADDRESS).
Numeric variable formats are normally not as big a problem. In particular DATE, TIME and DATETIME formats are needed so that the values appear properly.
The other tricky thing about the format statement is that it works to remove the formats because it does not specify a format after the variable list as you would normally do when using the format statement to attach formats.
hi ... removing formats is one of the tasks that you can also accomplish with PROC DATASETS rather than a data step
if the data set is large, you might save some time with ...
proc datasets lib=<whatever> nolist;
modify want;
format _character_;
quit;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.