- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
hi I need to search multiple strings from a text variable (large dataset of almost 180K lines). I was using:
vista=0;
if index(lowcase(studydescription),'cine') then vista=1;
if index(lowcase(studydescription),'3pl loc') then vista=1;
if index(lowcase(studydescription),'spiral') then vista=1;
but have more than 40 strings that would qualify for vista=1. These strings are buried within the field "studydescription" and not in any fixed position. I have not been able to use the scan function either. Any simple solution where i can search any of these strings in a single or a few lines of codes? thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
One way to do this is to 'build' a DATA step with as many IF statements that you need by using what is sometimes called a 'driver' data set that is just a vector of the strings that you are trying to find in your main data set. This example shows you 1) create the driver data set (but you can import it into SAS from excel or a CSV since you have many more strings) and 2) how to "build" a data step using the CALL EXECUTE routine (which I have to say is a fantastic tool in SAS 9.4). This routine is going to process one IF statement for each string that is in the driver data step. There are other IF statements there to build the beginning and the end of the data step.
data driver;
input var $;
cards;
A6
A4
TL
;
run;
data _null_;
set driver end=eof;
if _n_ = 1 then do;
call execute('data want;');
call execute('set sashelp.cars;');
end;
call execute('if index(Model,'||'strip(var)'||') > 0 then vista=1;');
if eof then do;
call execute('keep Model vista;');
call execute('run;');
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Code like below should do the job.
It loads your list of search strings into a hash table for lookup and then for each row in your source table iterates over this hash table until there is a match or until there are no further search terms left.
data search_list;
infile datalines truncover;
input search_string :$10.;
datalines;
fre
ud
nal
;
data test;
if _n_=1 then
do;
if 0 then set search_list(keep=search_string);
dcl hash h1(dataset:'search_list', multidata:'y');
h1.defineKey('search_string');
h1.defineData('search_string');
h1.defineDone();
dcl hiter hh1('h1');
end;
set sashelp.class;
vista=0;
_rc = hh1.first();
do while (_rc = 0);
if find(name,search_string,'it')>0 then
do;
vista=1;
leave;
end;
_rc = hh1.next();
end;
drop search_string _rc;
run;
proc print data=test;
run;