Hi,
I have a field that contains a string in many different formats. What I want to do is remove the last character in a string. Please see below:
Have
data one;
input name $1-24;
cards;
BANGE RICHARD M
AARON J BOWLER
;
run;
data two;
set one;
length new_name $24;
new_name=substr(name, 1, length(name)-1);
run;
proc print;
run;
Want
BANGE RICHARD
AARON J BOWLER
The code above works for the top record bit incorrecly removes a character from Bowler. Is there a simpe way to remove the last character where its not attached to another string but ignore full words?
data two;
set one;
length new_name $24;
new_name=ifc(char(name,lengthn(name)-1)=" ",substr(name,1,lengthn(name)-1),name);
new_name=ifc(char(new_name,2)=" ",substr(new_name,3),new_name);
run;
Well an if could sort that:
data one;
input name $1-24;
cards;
BANGE RICHARD M
AARON J BOWLER
;
run;
data two;
set one;
length new_name $24;
new_name=ifc(char(name,lengthn(name)-1)=" ",substr(name,1,lengthn(name)-1),name);
run;
You can try using a coditional statement :-
data one;
input name $1-24;
cards;
BANGE RICHARD M
AARON J BOWLER
;
run;
data two;
set one;
length new_name $24;
if length(scan(reverse(name),1)) = 1 then
new_name = substr(name, 1, length(name)-1);
else new_name = name;
run;
proc print;
run;
do let me know for your suggestions.
HI,
I like this suggestion also. How would you cleanse the name "A IAN GILES" to "IAN GILES" in the same syntax? Thanks
Thanks that seems to work fine. Just going through this poor quality data. Would you know how to deal with records like the below as part of the same syntax:
Have
A IAN GILES
Want
IAN GILES
I should be then good to go to split the names into forenames and surnames and then create a matchkey to match data in different files. Many thanks for your help.
Hi,
according to me you can again use the same set of conditional statements. with a few changes
data one;
input name $1-24;
cards;
BANGE RICHARD M
AARON J BOWLER
;
run;
data two;
set one;
length new_name $24;
if length(scan(name,1)) = 1 then
new_name = substr(name, 3);
else new_name = name;
run;
proc print;
run;
guess that would work. 🙂
data two;
set one;
length new_name $24;
new_name=ifc(char(name,lengthn(name)-1)=" ",substr(name,1,lengthn(name)-1),name);
new_name=ifc(char(new_name,2)=" ",substr(new_name,3),new_name);
run;
Alternatively you could try the perl regular expression which will achieve the expected output from a single line code
data one;
input name $1-24;
newname=prxchange('s/^\w\s|\w$/ /',-1,strip(name));
cards;
BANGE RICHARD M
AARON J BOWLER
A IAN GILES
;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.