I have two datasets A and B. Table A contains a list of Keywords and Table B contains a list of Merchant Names.
I have to create a table which will have all those observations from table B where the Merchant Name in table B contains any of the Keywords present in table A.
Usually when I have a handful of keywords I do a wildcard search like this:
data B1;
Set B (where=( merchant_name like "%keyword1%"
or merchant_name like "%keyword2%"));
RUN;
But because I have a long list of keywords in this case how do I approach this? Will "like/contains" work in this case? Is this the correct way?
Proc sql;
create table B1 as
select * from B where merchant_name contains (select distinct Keyword from A);
quit;
Thanks,
Shradha
If you want to flag the records rather than filter them you can try this method.
/*This is an example of how to search through a list of terms and see if a field contains any of the values*/
*Make fake data to show example;
*terms to search for;
data terms;
set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
*main data set that will be searched;
data test;
set sashelp.baseball;
run;
/*General process to the solution*/
******************************************************************************************
1. Store the number of terms in a macro variable to assign the length of arrays
2. Load terms to search into a temporary array
3. Loop through for each word and search the terms
4. Exit loop if you find the term to help speed up the process
******************************************************************************************;
/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;
%put &num_search_terms.;
data flagged;
*declare array;
array _search(&num_search_terms.) $ _temporary_;
/*2*/
*load array into memory;
if _n_ = 1 then do j=1 to &num_search_terms.;
set terms;
_search(j) = search_term;
end;
set test;
*set flag to 0 for initial start;
flag = 0;
/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
if find(team, _search(i), 'it')>0 then flag=1;
end;
drop i j search_term ;
run;
Hi,
use 'IN' instead of contains
- Cheers -
Use the EXISTS functionality of SQL.
create table B1 as
select * from B
where exists (select 1 from A where find(B.merchant_name,A.Keyword,'it'))
;
If you want to flag the records rather than filter them you can try this method.
/*This is an example of how to search through a list of terms and see if a field contains any of the values*/
*Make fake data to show example;
*terms to search for;
data terms;
set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
*main data set that will be searched;
data test;
set sashelp.baseball;
run;
/*General process to the solution*/
******************************************************************************************
1. Store the number of terms in a macro variable to assign the length of arrays
2. Load terms to search into a temporary array
3. Loop through for each word and search the terms
4. Exit loop if you find the term to help speed up the process
******************************************************************************************;
/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;
%put &num_search_terms.;
data flagged;
*declare array;
array _search(&num_search_terms.) $ _temporary_;
/*2*/
*load array into memory;
if _n_ = 1 then do j=1 to &num_search_terms.;
set terms;
_search(j) = search_term;
end;
set test;
*set flag to 0 for initial start;
flag = 0;
/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
if find(team, _search(i), 'it')>0 then flag=1;
end;
drop i j search_term ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.