BookmarkSubscribeRSS Feed
katamul
Calcite | Level 5

I am trying to merge/search two dataset where Searchterm from dataset 2 should be looked up in var1 for dataset 1. If any one of the search is indexed or found then it flags Y. However, it should remove the flag if any term from var1 matches anything in Excl1 to excl4 variable from dataset 2. 

 

You can see the desired output. 

 

Can any one help me with this ?

 

Dataset 1

var1Client
Bank of America423
Wells Fargo232
Chase34
American Bank24
Bank of Japan3453
European Bank23
Bank of London623
Waco Bank553
Brazilian Bank403.6
West California Bank226.4
Citigroup Japan Bank49.2
PNC Bank128
Charles Schwab Corporation305.2
Bank of New York Mellon482.4
Capital One Bank659.6
BB&T Bank836.8
State Street Corporation Bank1014
Discover Financial Bank1191.2
SunTrust Bank1368.4
M&T Bank1545.6
Fifth Third Bank1722.8

 

Dataset2

SearchTermbroadexcl1excl2excl3excl4
BankYJapanLondonBrazilianEuropean
America     
Wells     
Chase     
Waco     

 

Desired Output:

var1ClientFlagged
Bank of America423Y
Wells Fargo232Y
Chase34Y
American Bank24Y
Bank of Japan3453N
European Bank23N
Bank of London623N
Waco Bank553Y
Brazilian Bank403.6N
West California Bank226.4Y
Citigroup Japan Bank49.2N
PNC Bank128Y
Charles Schwab Corporation305.2N
Bank of New York Mellon482.4Y
Capital One Bank659.6Y
BB&T Bank836.8Y
State Street Corporation Bank1014Y
Discover Financial Bank1191.2Y
SunTrust Bank1368.4Y
M&T Bank1545.6Y
Fifth Third Bank1722.8Y
6 REPLIES 6
Jagadishkatam
Amethyst | Level 16

We could get the flag derived without the merge, but by simple search. For the same we need to have the macro variables created for the words which when found the flag='y' and the other words which when found the flag='n'. Hence using the below data i created two macro variables like term and noterm.

 

Now use the term to flag='y' and noterm to flag='n'

 

data have2;
infile cards missover;
input SearchTerm$	broad$	excl1$	excl2$	excl3$	excl4$;
cards;
Bank	Y	Japan	London	Brazilian	European
America	 	 	 	 	 
Wells	 	 	 	 	 
Chase	 	 	 	 	 
Waco
;


proc sql noprint;
select distinct searchterm into: term separated by '|' from have2;
select catx('|',excl1,excl2,excl3,excl4) into: noterm from have2 where broad='Y';
quit;

%put &term &noterm;

data have1;
input var1&$30.	Client;
if prxmatch("m/&term/i",var1) then flag='Y';
if prxmatch("m/&noterm/i",var1) then flag='N';
cards;
Bank of America			423
Wells Fargo		232
Chase			34
American Bank	24
Bank of Japan	3453
European Bank	23
Bank of London	623
Waco Bank			553
Brazilian Bank			403.6
West California Bank	226.4
Citigroup Japan Bank	49.2
PNC Bank	128
Charles Schwab Corporation	305.2
Bank of New York Mellon		482.4
Capital One Bank			659.6
BB&T Bank						836.8
State Street Corporation Bank	1014
Discover Financial Bank			1191.2
SunTrust Bank					1368.4
M&T Bank						1545.6
Fifth Third Bank				1722.8
;
Thanks,
Jag
art297
Opal | Level 21

@Jagadishkatam: I like your solution, but it doesn't match the last string in &noterm. and I have no idea why it doesn't!

 

I've posted your code as a question on SAS-L to increase the chance of finding out why it doesn't work for European.

 

@katamul: Why do you want to set a flag for Charles Schwab Corporation? It doesn't meet any of your conditions.

 

Art, CEO, AnalystFinder.com

 

art297
Opal | Level 21

Got the answer: In the final data step you have to use the macro trim function. i.e.

 

if prxmatch("m/%trim(&term)/i",var1) then flag='Y';
if prxmatch("m/%trim(&noterm)/i",var1) then flag='N';

Art, CEO, AnalystFinder.com

Ksharp
Super User

Better add '\b'  to specify word broad .

 

select cats('\b',searchterm,'\b') into : term separated by '|'

Jagadishkatam
Amethyst | Level 16
Thank you Art and Ksharp for the suggestions
Thanks,
Jag
art297
Opal | Level 21

Also, I just learned something new. SQL has a trimmed option that is the default when using separated by. Thus, the SQL call should be:

proc sql noprint;
  select distinct searchterm
    into :term separated by '|'
      from dataset2
  ;
  select catx('|',excl1,excl2,excl3,excl4)
    into: noterm trimmed
      from dataset2
        where broad='Y'
  ;
quit;

data want;
  set dataset1;
  if prxmatch("m/&term/i",var1) then flag='Y';
  if prxmatch("m/&noterm/i",var1) then flag='N';
run;

Art, CEO, AnalystFinder.com

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1851 views
  • 1 like
  • 4 in conversation