I am working on a dataset in which I have just transposed the data from wide to long. My next step is to delete a word. I want to get rid of the word "county" and remove the underscores. I was told the tranwrd and scan function can assist but I am not sure how to build that code. I was wondering if anyone has used these functions in this way before?
Attached is what my output looks like that I want to change and what my code looks like.
Thank you,
Since it's the word county at the start, SUBSTR is easier in this case.
If you leave the third parameter of the SUBSTR function it goes to the end by default. So starting it at 7, truncates out the word county. It may be 6...test it.
CountyName = substr(_name_, 7);
@Guerraje wrote:
I am working on a dataset in which I have just transposed the data from wide to long. My next step is to delete a word. I want to get rid of the word "county" and remove the underscores. I was told the tranwrd and scan function can assist but I am not sure how to build that code. I was wondering if anyone has used these functions in this way before?
Attached is what my output looks like that I want to change and what my code looks like.
Thank you,
When running that code, this is the error I get. The variable name is _Name_ should that be before the = and county be inside of the parenthesis?
You have to include the assignment statement in a data step.
What is the name of the dataset that has the variable _NAME_? What is the name of the dataset you want to create where the values of _NAME_ are changed? Where do you want to store the value of the new variables names? Back into the original _NAME_ variable? Somewhere else?
Perhaps:
data want;
set have;
if upcase(_name_)=: 'COUNTY' then new_name=substr(_name_,7);
else new_name=_name_;
run;
Perhaps if you explained the whole problem you would get a better answer. For example it sounds to me like you might want to actual change the NAME of some variables. Not just change the values stored in some character variable.
my dataset name is: coimpt.Insurance
so if I make a data step would like be correct? Because I get this error.
Data CountyNames;
Set coimpt.Insurance;
If upcase (_name_) = 'COUNTY' then Countynew = substr (_name_7);
else Countynew = _name_;
run;
I will want to remove the name "county" that is in front of all the county names in the output, and remove the underscores listed within the county names (I pasted what the output Im trying to change looks like above).
Posting text as text instead of photographs makes it much easier to review and respond to your issues.
You did not include the comma between the two arguments to the SUBSTR() function. In addition to the error SAS is reporting (that you do not have a enough arguments) it will also fail because you probably do not have a variable named _NAME_7. Probably you variable is named _NAME_.
After fixing the code, I no longer have errors, however, my output went back to what it was before. Wide instead of long and the word county is still in front of all the actually county names.
thank you,
Please show the whole program.
The latest snapshot you posted does not appear to have anything to do with the code you posted before. Where is the variable called _NAME_?
PROC TRANSPOSE DATA = coimpt.Insurance OUT= county;
run;
Data CountyNames;
Set coimpt.Insurance;
If upcase (_NAME_) = 'COUNTY' then Countynew = substr(_NAME_,7);
else Countynew = _NAME_;
run;
I first transpose and that is where the variable _NAME_ is observed. Now I am trying to get rid of the word "coutny" in front of all the actually county names (Adams, Alamosa.. so on). After running the code above that we have been using it, it changes my code back to running wide not long and the variable names change back to what you see in the last photo
You are using the wrong dataset in the set statement, you should use the result of proc transpose.
Thank you so much. That fixed my last problem. Now I have created another column. So I have 3 columns of all of the same information now. I am still unable to get rid of the word "county" in front of all of the county names. Thank you for your patience. My updated code after fixing the set statement looks like this:
PROC TRANSPOSE DATA = coimpt.Insurance OUT= county;
run;
Data CountyNames;
Set county;
If upcase (_NAME_) = 'COUNTY' then Countynew = substr(_NAME_,7);
else Countynew = _NAME_;
run;
You keep posting snippets of your code. This time you posted text as text instead of a picture of it. Next learn to use the Insert SAS code button on the menu bar to get a box that will preserve the formatting.
Your IF statement is testing if the name is exactly COUNTY instead of testing if it starts with COUNTY.
The colon modifier on the comparison operator will tell SAS to compare only up to the length of the shorter of the two strings.
data CountyNames;
set county;
if upcase(_NAME_) =: 'COUNTY' then Countynew = substr(_NAME_,7);
else Countynew = _NAME_;
run;
Spelling counts. You are using ='COUNTY' but the values you keep showing pictures of has 'county'. To a computer 'COUNTY' is not the same as 'county' as the letters are different. Spell the text you search for as it appears in your data.
@Guerraje wrote:
Thank you so much. That fixed my last problem. Now I have created another column. So I have 3 columns of all of the same information now. I am still unable to get rid of the word "county" in front of all of the county names. Thank you for your patience. My updated code after fixing the set statement looks like this:
PROC TRANSPOSE DATA = coimpt.Insurance OUT= county;
run;
Data CountyNames;
Set county;
If upcase (_NAME_) = 'COUNTY' then Countynew = substr(_NAME_,7);
else Countynew = _NAME_;
run;
When making "COUNTY" "county"
my output remains the same.
If the goal is to make a version of coimpt.Insurance with the names of the variables changed then it might just be easier to generate the code needed for a RENAME statement or RENAME= dataset option.
Here is a step by step way to do that.
First get a dataset with just the variable names. You could use PROC TRANSPOSE for that.
proc transpose data=coimpt.Insurance(obs=0) out=names;
var _all_;
run;
Note that the OBS=0 dataset option will make transpose just create a list of variables. The variable names will stored in a variable named _NAME_. (You could use the NAME= option of PROC TRANSPOSE to use another name for this variable.) The VAR statement with the _ALL_ variable list will insure that the character variables are included. The default for PROC TRANSPOSE is to only operate on the numeric variables.
Now use that list to generate a series of OLD=NEW rename pairs. We can use PROC SQL to generate that into a macro variable. Let's call the macro variable RENAME. Let's restrict it just the variables whose name starts with 'county'. The NLITERAL() function calls are just in case any of your variables do not follow standard SAS naming conventions, for example names that start with a number or have embedded spaces.
proc sql noprint;
select catx('=',nliteral(_name_),nliteral(substr(_name_,7)))
into :rename separated by ' '
from names
where _name_ like 'county%'
;
quit;
You can use code like this to see what the new macro variable has. Also check the automatic macro variable SQLOBS to see how many names were found that started with lowercase county.
%put &=sqlobs;
%put &=rename;
Now let's make a new dataset that has the data from the original dataset, but changes the names of those variables. For the purpose of this example let's call the new dataset WANT. We can use the RENAME statement in this data step.
data want;
set coimpt.Insurance ;
rename &rename;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.