Hi all,
I guess the title is not very informative. I have a list of words that I want to remove from some character variables.
Assume that I have the following data:
Header 1 |
---|
MARK SMITH MARCH |
JOHN END BROWN |
BILL VICE GREEN MARCH |
I want to remove the words (MARCH, END, and VICE) from these values.
Normally, I would use the TRANWRD function:
var2 = tranwrd(var1, 'MARCH',' ' );
var3 = tranwrd(var2, 'END',' ');
var4 = tranwrd(var3,'VICE',' ');
But I feel that this is not an efficient way, especially because the number of words that I want to remove is huge.
So I am thinking of first defining a list:
list = ['MARCH', 'END', 'VICE']
and then somehow writing a loop over the list.
Can someone tell me whether there is an efficient way to do that?
Thanks a lot.
Here is one possibility:
data have;
informat header1 $80.;
input Header1 &;
cards;
BILL VICE GREEN MARCH
JOHN END BROWN
MARK SMITH MARCH
JOHN DOE
;
data want (drop=_:);
set have;
_patternID = prxparse('/VICE|MARCH|END/');
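do until (_position eq 0);
call prxsubstr(_patternID, header1, _position, _length);
/* rebuild only when a match was found; SUBSTRN accepts a zero length (match at column 1) */
if _position gt 0 then
header1=catt(substrn(header1,1,_position-1),
substrn(header1,_position+_length));
end;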
run;
Alternatively, try the PRXCHANGE function:
data have;
input text &$100.;
text=prxchange('s/MARCH|END|VICE//i',-1,text);
cards;
MARK SMITH MARCH
JOHN END BROWN
BILL VICE GREEN MARCH
;
Thanks,
Jag
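One thing to watch with the PRX patterns in this thread: without word boundaries, END would also match inside a longer word such as BRENDA, and PRXCHANGE leaves a doubled blank where a middle word was removed. Here is a minimal sketch of a whole-word variant. The dataset name demo, the variable cleaned, and the BRENDA row are made up for illustration only.

```sas
/* hypothetical test data; the BRENDA row checks that END inside a word is kept */
data demo;
  length text $100;
  input text &;
  cards;
MARK SMITH MARCH
JOHN END BROWN
BRENDA VICE GREEN MARCH
;

data demo_clean;
  set demo;
  length cleaned $100;
  /* \b anchors restrict the match to whole words; the i flag ignores case */
  cleaned = prxchange('s/\b(MARCH|END|VICE)\b//i', -1, text);
  /* collapse the doubled blanks left behind, then trim both ends */
  cleaned = strip(compbl(cleaned));
run;
```

With this pattern the third row should keep BRENDA intact and lose only VICE and MARCH. If partial matches are actually wanted, drop the \b anchors.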
Here's a way using temporary arrays that loads the words from a separate dataset.
It is probably not as fast as PRX, but it is easier to debug.
data have;
informat header1 $80.;
input Header1 &;
cards;
BILL VICE GREEN MARCH
JOHN END BROWN
MARK SMITH MARCH
JOHN DOE
;
data word_search;
informat words_find $8.;
input words_find;
cards;
MARCH
END
VICE
;
run;
data want;
*load word list into temporary array;
array words(3) $ _temporary_ ;
if _n_=1 then do i=1 to 3;
set word_search;
words(i)=words_find;
end;
*search for words;
set have;
var=header1;
do i=1 to dim(words);
var=compbl(tranwrd(var, compress(words(i)), ' '));
end;
run;
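Since the real word list is large, the hard-coded 3 in the array step would have to be kept in sync with the word dataset by hand. One possible way around that, sketched here under assumptions, is to let PROC SQL build the alternation for a PRX pattern straight from word_search. The macro variable name word_alt and the output names want_prx and var are made up for illustration. The sketch also assumes that the words contain no regex metacharacters or macro triggers (& or %), and that the joined list fits within the macro variable and pattern length limits.

```sas
/* assumes WORD_SEARCH (variable WORDS_FIND) and HAVE (variable HEADER1) from the steps above */
proc sql noprint;
  /* join all words into one string, e.g. MARCH|END|VICE */
  select strip(words_find) into :word_alt separated by '|'
  from word_search;
quit;

data want_prx;
  set have;
  length var $80;
  /* double quotes let the macro variable resolve inside the pattern */
  var = prxchange("s/\b(&word_alt)\b//", -1, header1);
  /* tidy up the blanks left by the removed words */
  var = strip(compbl(var));
run;
```

Adding or removing rows in word_search then changes the pattern without touching the data step. Note that this version matches whole words only, whereas TRANWRD also replaces the text when it appears inside a longer word.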
data want(drop=list);
set have;
Header1_=Header1;
length list $30;
do list = 'MARCH', 'END', 'VICE';
Header1_ = tranwrd(strip(Header1_),strip(list),'');
end;
run;
Thanks a lot, guys.
I got some really awesome ideas.