BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Shradha1
Obsidian | Level 7

I have two datasets A and B. Table A contains a list of Keywords and Table B contains a list of Merchant Names.

I have to create a table which will have all those observations from table B where the Merchant Name in table B contains any of the Keywords present in table A.

Usually when I have a handful of keywords I do a wildcard search like this: 

data B1;
Set B (where=( merchant_name like "%keyword1%"
or merchant_name like "%keyword2%"));
RUN;

But because I have a long list of keywords in this case how do I approach this? Will "like/contains" work in this case?  Is this the correct way?

Proc sql;
create table B1 as
select * from B where merchant_name contains (select distinct Keyword from A);
quit;

Thanks,

Shradha

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

If you want to flag the records rather than filter them you can try this method.

 

/*This is an example of how to search through a list of terms and see if a field contains any of the values*/

*Make fake data to show example;
*terms to search for;
data terms; 
set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
    
*main data set that will be searched;
data test; 
set sashelp.baseball;
run;


/*General process to the solution*/
******************************************************************************************
1. Store the number of terms in a macro variable to assign the length of arrays
2. Load terms to search into a temporary array
3. Loop through for each word and search the terms
4. Exit loop if you find the term to help speed up the process
******************************************************************************************;



/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;

%put &num_search_terms.;



data flagged;

*declare array;
array _search(&num_search_terms.) $ _temporary_;

/*2*/
*load array into memory;
   if _n_ = 1 then do j=1 to &num_search_terms.;
   set terms;
   _search(j) = search_term;
   end;
 
 set test;
 
 *set flag to 0 for initial start;
 flag = 0;

/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
   if find(team, _search(i), 'it')>0 then flag=1;
end;

drop i j search_term ;

run;

View solution in original post

6 REPLIES 6
Oligolas
Barite | Level 11

Hi,

use 'IN' instead of contains

________________________

- Cheers -

Shradha1
Obsidian | Level 7
But the merchants names in dataset B may not be exactly the same as the keywords. For example the merchant name can be 'Amazon.com' but by keyword list may have 'Amazon', 'Amazon Pay' etc.
In this case I would want the observation 'Amazon.com' to be selected in the new table B1.
Tom
Super User Tom
Super User

Use the EXISTS functionality of SQL.

create table B1 as
  select * from B 
  where exists (select 1 from A where find(B.merchant_name,A.Keyword,'it'))
;
Shradha1
Obsidian | Level 7
This worked. Thanks!
Is there any way this can be incorporated within an Case When statement also?
Like 'Case when merchant_name contains (select distinct Keyword from A) then 1 else 0 end as flag'
Ksharp
Super User
Proc sql;
create table B1 as
select *
from B inner join A
on merchant_name contains strip( Keyword );
quit;
Reeza
Super User

If you want to flag the records rather than filter them you can try this method.

 

/*This is an example of how to search through a list of terms and see if a field contains any of the values*/

*Make fake data to show example;
*terms to search for;
data terms; 
set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
    
*main data set that will be searched;
data test; 
set sashelp.baseball;
run;


/*General process to the solution*/
******************************************************************************************
1. Store the number of terms in a macro variable to assign the length of arrays
2. Load terms to search into a temporary array
3. Loop through for each word and search the terms
4. Exit loop if you find the term to help speed up the process
******************************************************************************************;



/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;

%put &num_search_terms.;



data flagged;

*declare array;
array _search(&num_search_terms.) $ _temporary_;

/*2*/
*load array into memory;
   if _n_ = 1 then do j=1 to &num_search_terms.;
   set terms;
   _search(j) = search_term;
   end;
 
 set test;
 
 *set flag to 0 for initial start;
 flag = 0;

/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
   if find(team, _search(i), 'it')>0 then flag=1;
end;

drop i j search_term ;

run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 4183 views
  • 2 likes
  • 5 in conversation