Hi all,
I guess the title is not very informative. I have a list of words that I want to remove from the values of some character variables.
Assume that I have the following data:
Header 1 |
---|
MARK SMITH MARCH |
JOHN END BROWN |
BILL VICE GREEN MARCH |
I want to remove the words MARCH, END, and VICE from these values.
Normally, I would use the tranwrd function:
var2 = tranwrd(var1, 'MARCH',' ' );
var3 = tranwrd(var2, 'END',' ');
var4 = tranwrd(var3,'VICE',' ');
But I feel that this is not efficient, especially because the number of words I want to remove is huge.
So I am thinking of first defining a list:
list = ['MARCH', 'END', 'VICE']
and then somehow writing a loop over the list.
Can someone tell me if there is an efficient way to do that?
Thanks a lot.
Here is one possibility:
data have;
informat header1 $80.;
input Header1 &;
cards;
BILL VICE GREEN MARCH
JOHN END BROWN
MARK SMITH MARCH
JOHN DOE
;
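data want (drop=_:);
set have;
* compile the pattern once and keep its ID across rows;
retain _patternID;
if _n_ eq 1 then _patternID = prxparse('/VICE|MARCH|END/');
* remove one match per pass until no match is left;
do until (_position eq 0);
call prxsubstr(_patternID, header1, _position, _length);
* substrn accepts a zero length, so a match at position 1 is safe;
if _position gt 0 then header1=catt(substrn(header1,1,_position-1),
substrn(header1,_position+_length));
end;
run;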
Alternatively, please try the prxchange function:
data have;
input text &$100.;
text=prxchange('s/MARCH|END|VICE//i',-1,text);
cards;
MARK SMITH MARCH
JOHN END BROWN
BILL VICE GREEN MARCH
;
Thanks,
Jag
Here's a way using temporary arrays that loads the words from a separate dataset.
It is probably not as fast as PRX, but it is easier to debug.
data have;
informat header1 $80.;
input Header1 &;
cards;
BILL VICE GREEN MARCH
JOHN END BROWN
MARK SMITH MARCH
JOHN DOE
;
data word_search;
informat words_find $8.;
input words_find;
cards;
MARCH
END
VICE
;
run;
data want(drop=i words_find);
*load word list into temporary array;
array words(3) $ _temporary_ ;
if _n_=1 then do i=1 to 3;
set word_search;
words(i)=words_find;
end;
*search for words;
set have;
var=header1;
do i=1 to dim(words);
var=compbl(tranwrd(var, compress(words(i)), ' '));
end;
run;
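The array size and loop bound above are hard-coded to 3. A minimal sketch of one way to size them from the word list instead, assuming the have and word_search data sets defined above; the macro variable name nwords is made up for this illustration, and the TRIMMED option assumes SAS 9.3 or later.

```sas
* count the words once and store the count in a macro variable;
* nwords is a hypothetical name used only in this sketch;
proc sql noprint;
select count(*) into :nwords trimmed from word_search;
quit;

data want(drop=i words_find);
* size the temporary array from the count instead of a literal 3;
array words(&nwords) $8 _temporary_;
if _n_=1 then do i=1 to &nwords;
set word_search;
words(i)=words_find;
end;
* remove each word in turn and collapse repeated blanks;
set have;
var=header1;
do i=1 to dim(words);
var=compbl(tranwrd(var, strip(words(i)), ' '));
end;
run;
```

With this change, adding or removing rows in word_search needs no edit to the data step. Note that tranwrd replaces any matching substring, so END would also be removed from a name such as BRENDA.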
data want(drop=list);
set have;
Header1_=Header1;
length list $30;
do list = 'MARCH', 'END', 'VICE';
Header1_ = tranwrd(strip(Header1_),strip(list),'');
end;
run;
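For a long word list, the regular expression approach can also be driven by a data set rather than a typed-out pattern. The following is a sketch only, under these assumptions: the words sit in a data set named word_search with a variable words_find (names borrowed from the earlier post), the words contain no regular expression metacharacters, and the joined list fits in a macro variable (at most 65,534 characters). The names wordlist and cleaned are hypothetical.

```sas
data have;
informat header1 $80.;
input header1 &;
cards;
BILL VICE GREEN MARCH
JOHN END BROWN
MARK SMITH MARCH
JOHN DOE
;

* hypothetical word list with one word per row;
data word_search;
input words_find :$8.;
cards;
MARCH
END
VICE
;

* join the words into one alternation such as MARCH|END|VICE;
proc sql noprint;
select strip(words_find) into :wordlist separated by '|'
from word_search;
quit;

data want;
set have;
length cleaned $80;
* \b limits matches to whole words and the i flag ignores case;
* compbl and strip tidy the blanks left behind by removed words;
cleaned = strip(compbl(prxchange("s/\b(&wordlist)\b//i", -1, header1)));
run;
```

The pattern is in double quotes so that &wordlist resolves before the data step compiles. The \b anchors are an addition over the patterns posted above; they keep END from being removed out of a longer word such as BRENDA.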
Thanks a lot guys,
I got really awesome ideas.