I am cleaning raw data so that it can be matched against another dataset. The K dataset requires a lot of reformatting, including moving suffixes from the FirstName variable to a Suffix variable and removing them from FirstName. The code below copies the suffixes to the new variable, but it does not remove "III" from the first-name variable. It works for the other suffixes, just not for III. I am including only the SAS code specific to this part of the data.

SAMPLE IMPORTED DATA:
...
input FirstName:$13.;
datalines;
RICHARD
MICHAEL
JR
III
;

/* ADD SUFFIX VARIABLE AND REMOVE FROM FIRST NAME */
/* REMOVES ALL 'I' n_lastName = COMPRESS(new_lastName,' III'); */
data K5;
    set K4;
    new_FirstName = FirstName;
    length suffix $ 7;
    do word = 'JR', 'SR', 'II', 'III', 'IV';
        new_FirstName = tranwrd(' '||new_FirstName, ' '||strip(word)||' ', ' ');
        new_FirstName = compbl(new_FirstName);
        if findw(FirstName, 'JR') > 0 then do;
            suffix = 'JR';
        end;
        if findw(FirstName, 'SR') > 0 then do;
            suffix = 'SR';
        end;
        if findw(FirstName, 'II') > 0 then do;
            suffix = 'II';
        end;
        if findw(FirstName, 'III') > 0 then do;
            suffix = 'III';
        end;
        if findw(FirstName, 'IV') > 0 then do;
            suffix = 'IV';
        end;
    end;
    drop word;
    *newlastName newlName last_name first_name middle_name lastName firstName middleName;
run;
/* NOTE: III IS STILL PRESENT IN new_FirstName */

SAMPLE K5 OUTPUT
FirstName    new_FirstName    suffix
RICHARD      RICHARD
MICHAEL      MICHAEL
JR                            JR
III          III              III

/* MOVE FIRST NAME FROM MIDDLENAME VARIABLE */
/* need to drop iii */
data K6;
    set K5;
    if new_FirstName = "III" then do;
        new_FirstName = "";
    end;
    If missing(new_FirstName) then new_FirstName=MiddleName;
run;

SAMPLE K6 OUTPUT
FirstName    new_FirstName    suffix
RICHARD      RICHARD
MICHAEL      MICHAEL
JR           WILLIAM          JR
III          III              III

How do I successfully drop the "III" from the name variable?