BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Shradha1
Obsidian | Level 7

I have two datasets A and B. Table A contains a list of Keywords and Table B contains a list of Merchant Names.

I have to create a table which will have all those observations from table B where the Merchant Name in table B contains any of the Keywords present in table A.

Usually when I have a handful of keywords I do a wildcard search like this: 

data B1;
Set B (where=( merchant_name like "%keyword1%"
or merchant_name like "%keyword2%"));
RUN;

But because I have a long list of keywords in this case how do I approach this? Will "like/contains" work in this case?  Is this the correct way?

Proc sql;
create table B1 as
select * from B where merchant_name contains (select distinct Keyword from A);
quit;

Thanks,

Shradha

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

If you want to flag the records rather than filter them you can try this method.

 

/*This is an example of how to search through a list of terms and see if a field contains any of the values*/

*Make fake data to show example;
*terms to search for;
data terms; 
set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
    
*main data set that will be searched;
data test; 
set sashelp.baseball;
run;


/*General process to the solution*/
******************************************************************************************
1. Store the number of terms in a macro variable to assign the length of arrays
2. Load terms to search into a temporary array
3. Loop through for each word and search the terms
4. Exit loop if you find the term to help speed up the process
******************************************************************************************;



/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;

%put &num_search_terms.;



data flagged;

*declare array;
array _search(&num_search_terms.) $ _temporary_;

/*2*/
*load array into memory;
   if _n_ = 1 then do j=1 to &num_search_terms.;
   set terms;
   _search(j) = search_term;
   end;
 
 set test;
 
 *set flag to 0 for initial start;
 flag = 0;

/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
   if find(team, _search(i), 'it')>0 then flag=1;
end;

drop i j search_term ;

run;

View solution in original post

6 REPLIES 6
Oligolas
Barite | Level 11

Hi,

use 'IN' instead of contains

________________________

- Cheers -

Shradha1
Obsidian | Level 7
But the merchants names in dataset B may not be exactly the same as the keywords. For example the merchant name can be 'Amazon.com' but by keyword list may have 'Amazon', 'Amazon Pay' etc.
In this case I would want the observation 'Amazon.com' to be selected in the new table B1.
Tom
Super User Tom
Super User

Use the EXISTS functionality of SQL.

create table B1 as
  select * from B 
  where exists (select 1 from A where find(B.merchant_name,A.Keyword,'it'))
;
Shradha1
Obsidian | Level 7
This worked. Thanks!
Is there any way this can be incorporated within an Case When statement also?
Like 'Case when merchant_name contains (select distinct Keyword from A) then 1 else 0 end as flag'
Ksharp
Super User
Proc sql;
create table B1 as
select *
from B inner join A
on merchant_name contains strip( Keyword );
quit;
Reeza
Super User

If you want to flag the records rather than filter them you can try this method.

 

/*This is an example of how to search through a list of terms and see if a field contains any of the values*/

*Make fake data to show example;
*terms to search for;
data terms; 
set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
    
*main data set that will be searched;
data test; 
set sashelp.baseball;
run;


/*General process to the solution*/
******************************************************************************************
1. Store the number of terms in a macro variable to assign the length of arrays
2. Load terms to search into a temporary array
3. Loop through for each word and search the terms
4. Exit loop if you find the term to help speed up the process
******************************************************************************************;



/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;

%put &num_search_terms.;



data flagged;

*declare array;
array _search(&num_search_terms.) $ _temporary_;

/*2*/
*load array into memory;
   if _n_ = 1 then do j=1 to &num_search_terms.;
   set terms;
   _search(j) = search_term;
   end;
 
 set test;
 
 *set flag to 0 for initial start;
 flag = 0;

/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
   if find(team, _search(i), 'it')>0 then flag=1;
end;

drop i j search_term ;

run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2438 views
  • 2 likes
  • 5 in conversation