I am trying to merge/search two dataset where Searchterm from dataset 2 should be looked up in var1 for dataset 1. If any one of the search is indexed or found then it flags Y. However, it should remove the flag if any term from var1 matches anything in Excl1 to excl4 variable from dataset 2.
You can see the desired output.
Can any one help me with this ?
Dataset 1
var1 | Client |
Bank of America | 423 |
Wells Fargo | 232 |
Chase | 34 |
American Bank | 24 |
Bank of Japan | 3453 |
European Bank | 23 |
Bank of London | 623 |
Waco Bank | 553 |
Brazilian Bank | 403.6 |
West California Bank | 226.4 |
Citigroup Japan Bank | 49.2 |
PNC Bank | 128 |
Charles Schwab Corporation | 305.2 |
Bank of New York Mellon | 482.4 |
Capital One Bank | 659.6 |
BB&T Bank | 836.8 |
State Street Corporation Bank | 1014 |
Discover Financial Bank | 1191.2 |
SunTrust Bank | 1368.4 |
M&T Bank | 1545.6 |
Fifth Third Bank | 1722.8 |
Dataset2
SearchTerm | broad | excl1 | excl2 | excl3 | excl4 |
Bank | Y | Japan | London | Brazilian | European |
America | |||||
Wells | |||||
Chase | |||||
Waco |
Desired Output:
var1 | Client | Flagged |
Bank of America | 423 | Y |
Wells Fargo | 232 | Y |
Chase | 34 | Y |
American Bank | 24 | Y |
Bank of Japan | 3453 | N |
European Bank | 23 | N |
Bank of London | 623 | N |
Waco Bank | 553 | Y |
Brazilian Bank | 403.6 | N |
West California Bank | 226.4 | Y |
Citigroup Japan Bank | 49.2 | N |
PNC Bank | 128 | Y |
Charles Schwab Corporation | 305.2 | N |
Bank of New York Mellon | 482.4 | Y |
Capital One Bank | 659.6 | Y |
BB&T Bank | 836.8 | Y |
State Street Corporation Bank | 1014 | Y |
Discover Financial Bank | 1191.2 | Y |
SunTrust Bank | 1368.4 | Y |
M&T Bank | 1545.6 | Y |
Fifth Third Bank | 1722.8 | Y |
We could get the flag derived without the merge, but by simple search. For the same we need to have the macro variables created for the words which when found the flag='y' and the other words which when found the flag='n'. Hence using the below data i created two macro variables like term and noterm.
Now use the term to flag='y' and noterm to flag='n'
data have2;
infile cards missover;
input SearchTerm$ broad$ excl1$ excl2$ excl3$ excl4$;
cards;
Bank Y Japan London Brazilian European
America
Wells
Chase
Waco
;
proc sql noprint;
select distinct searchterm into: term separated by '|' from have2;
select catx('|',excl1,excl2,excl3,excl4) into: noterm from have2 where broad='Y';
quit;
%put &term ¬erm;
data have1;
input var1&$30. Client;
if prxmatch("m/&term/i",var1) then flag='Y';
if prxmatch("m/¬erm/i",var1) then flag='N';
cards;
Bank of America 423
Wells Fargo 232
Chase 34
American Bank 24
Bank of Japan 3453
European Bank 23
Bank of London 623
Waco Bank 553
Brazilian Bank 403.6
West California Bank 226.4
Citigroup Japan Bank 49.2
PNC Bank 128
Charles Schwab Corporation 305.2
Bank of New York Mellon 482.4
Capital One Bank 659.6
BB&T Bank 836.8
State Street Corporation Bank 1014
Discover Financial Bank 1191.2
SunTrust Bank 1368.4
M&T Bank 1545.6
Fifth Third Bank 1722.8
;
@Jagadishkatam: I like your solution, but it doesn't match the last string in ¬erm. and I have no idea why it doesn't!
I've posted your code as a question on SAS-L to increase the chance of finding out why it doesn't work for European.
@katamul: Why do you want to set a flag for Charles Schwab Corporation? It doesn't meet any of your conditions.
Art, CEO, AnalystFinder.com
Got the answer: In the final data step you have to use the macro trim function. i.e.
if prxmatch("m/%trim(&term)/i",var1) then flag='Y'; if prxmatch("m/%trim(¬erm)/i",var1) then flag='N';
Art, CEO, AnalystFinder.com
Better add '\b' to specify word broad .
select cats('\b',searchterm,'\b') into : term separated by '|'
Also, I just learned something new. SQL has a trimmed option that is the default when using separated by. Thus, the SQL call should be:
proc sql noprint; select distinct searchterm into :term separated by '|' from dataset2 ; select catx('|',excl1,excl2,excl3,excl4) into: noterm trimmed from dataset2 where broad='Y' ; quit; data want; set dataset1; if prxmatch("m/&term/i",var1) then flag='Y'; if prxmatch("m/¬erm/i",var1) then flag='N'; run;
Art, CEO, AnalystFinder.com
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.