<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Finding Strings (Exact or Partial Match) in a Line in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613776#M18585</link>
    <description>&lt;P&gt;I would probably use the prxmatch function, but the problem doesn't seem well defined to me.&amp;nbsp; What do you mean by "partial match"?&amp;nbsp; If "Hasan" matches "Hassan", why wouldn't "as" also match?&amp;nbsp; Or "sanitary"?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One of the spelling difference functions might be useful, but would probably be very expensive.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe one of the Dataflux products would do this if you have it licensed.&lt;/P&gt;
</description>
    <pubDate>Tue, 24 Dec 2019 18:45:22 GMT</pubDate>
    <dc:creator>JackHamilton</dc:creator>
    <dc:date>2019-12-24T18:45:22Z</dc:date>
    <item>
      <title>Finding Strings (Exact or Partial Match) in a Line</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613762#M18584</link>
      <description>&lt;P&gt;I have free-text Narrative fields. I need to scan each field for multiple keywords and report the strings that match, either exactly or partially.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Narrative Field:&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;SUPPLY OF MANPOWER AS PER PROFORMA INVOICE DATED 14.01.2019&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;1. SIGNED COMMERCIAL INVOICE(S) IN 1 ORIGINAL AND 2 COPIES&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;SHOWING DATE OF SUPPLY OF MANPOWER NOT LATER THAN 15.05.2019 AND&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;DULY COUNTERSIGNED BY APPLICANTS AUTHORIZED SIGNATORY AND TO BE&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;AUTHENTICATED BY &lt;STRONG&gt;IRAN NATIONAL BANK&lt;/STRONG&gt;, &lt;STRONG&gt;ERAN&lt;/STRONG&gt; TRADE FINANCE DEPARTMENT&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;PRIOR PRESENTATION OF DOCS FOR NEGOTIATION.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;IN THE ABSENCE OF DATE OF SUPPLY OF MANPOWER, THE DATE SHOWN ON&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;COMMERCIAL INVOICE WILL BE CONSIDERED AS THE SUPPLY DATE for &lt;STRONG&gt;DAWOOD HASSAN&lt;/STRONG&gt;.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;USD 120/- OR EQUIVALENT IN THE L/C CURRENCY AND RELATED&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;CHARGES SHOULD BE DEDUCTED FROM THE PAYMENT FOR EACH PRESENTATION by &lt;STRONG&gt;DAWOOD HASAN&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;OF DISCREPANT DOCUMENTS UNDER THIS CREDIT, NOT WITHSTANDING ANY&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;INSTRUCTION TO THE CONTRARY, THIS CHARGE SHALL BE FOR THE ACCOUNT&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;OF BENEFICIARY&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;2. 
BENEFICIARYS A/C NO.: 202-577688-001-0010-000&amp;nbsp; BIC: &lt;STRONG&gt;PIBPBG2L&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;APPLICANT ACCOUNT. ALL OTHER&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;CHARGES INCLUDING REIMBURSEMENT AND&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;SWIFT PAYMENTS RELATED CHARGES ARE&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;FOR BENEFICIARY ACCOUNT in &lt;STRONG&gt;SYRIA&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;WITHOUT DESPATCH FULL SET OF PRESENTED / NEGOTIATED DOCUMENTS IN ONE LOT&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;BY COURIER TO: QATAR NATIONAL BANK, MAIN OFFICE, GRAND HAMAD&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;STREET, TRADE FINANCE DEPARTMENT, IMPORTS SECTION, P.O. BOX 1000,&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;DOHA, QATAR.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;++UPON RECEIPT OF CREDIT COMPLYING DOCUMENTS &lt;STRONG&gt;OSMA BIN LADEN&lt;/STRONG&gt; PAYMENT SHALL BE&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;EFFECTED BY US AS PER PRESENTING BANKS INSTRUCTION.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have to scan above Narrative field and find out list of keywords given below&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PIBPBG2L&lt;/P&gt;&lt;P&gt;OSAMA BIN LADEN&lt;/P&gt;&lt;P&gt;DAWOOD HASSAN&lt;/P&gt;&lt;P&gt;SYRIA&lt;/P&gt;&lt;P&gt;IRAN&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The scanning should give Partially or Exact matched strings in above Narrative field. 
Here, the output will be:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Matched strings for PIBPBG2L: &lt;STRONG&gt;&lt;FONT color="#0000FF"&gt;PIBPBG2L &lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT color="#0000FF"&gt;(Exact Match)&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Matched strings for OSAMA BIN LADEN: &lt;STRONG&gt;&lt;FONT color="#0000FF"&gt;OSMA BIN LADEN&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT color="#0000FF"&gt; (Partial Match)&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Matched strings for DAWOOD HASSAN: &lt;FONT color="#0000FF"&gt;&lt;STRONG&gt;DAWOOD HASSAN&amp;nbsp;&lt;/STRONG&gt;(Exact Match) &lt;STRONG&gt;&amp;amp; DAWOOD HASAN&lt;/STRONG&gt; (Partial Match)&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Matched strings for SYRIA: &lt;FONT color="#0000FF"&gt;&lt;STRONG&gt;SYRIA &lt;/STRONG&gt;(Exact Match)&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Matched strings for IRAN: &lt;FONT color="#0000FF"&gt;&lt;STRONG&gt;IRAN&lt;/STRONG&gt;&amp;nbsp;(Exact Match)&lt;STRONG&gt; &amp;amp; ERAN &lt;/STRONG&gt;(Partial Match)&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please note that I have to scan for more than 1 million such keywords in real time, so please suggest an approach that would be fast. I am not using a LIKE-style match, because it would be very slow for 1 million records and would find exact matches only. For further context, I have to scan the complete World Check names list against the Narrative field above, with exact or partial matching.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Dec 2019 17:24:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613762#M18584</guid>
      <dc:creator>manishiiita</dc:creator>
      <dc:date>2019-12-24T17:24:32Z</dc:date>
    </item>
    <item>
      <title>Re: Finding Strings (Exact or Partial Match) in a Line</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613776#M18585</link>
      <description>&lt;P&gt;I would probably use the prxmatch function, but the problem doesn't seem well defined to me.&amp;nbsp; What do you mean by "partial match"?&amp;nbsp; If "Hasan" matches "Hassan", why wouldn't "as" also match?&amp;nbsp; Or "sanitary"?&lt;/P&gt;
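&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For the exact-match case, a minimal sketch of what a prxmatch call could look like. The sample text and variable names are made up for illustration, and the \b word-boundary anchors are one assumed way to stop "SAN" from matching inside "HASSAN" or "SANITARY":&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
	length TextStr $200;
	/* Hypothetical sample line, not taken from real data */
	TextStr = 'SUPPLY DATE for DAWOOD HASSAN. SANITARY GOODS AS PER INVOICE';
	/* \b anchors the keyword at word boundaries; the i modifier ignores case */
	PosWhole = prxmatch('/\bHASSAN\b/i', TextStr);
	/* SAN only occurs inside longer words here, so prxmatch returns 0 */
	PosInside = prxmatch('/\bSAN\b/i', TextStr);
	put PosWhole= PosInside=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;A non-zero result is the position where the match starts. This only covers exact whole-word hits; it does nothing for misspellings.&lt;/P&gt;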
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One of the spelling difference functions might be useful, but would probably be very expensive.&lt;/P&gt;
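&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A small sketch of how those functions could be called on the misspellings from the question. complev, compged and spedis are Base SAS functions; the variable names and the one-edit cutoff are assumptions for illustration only:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
	/* Levenshtein edit distance: 0 means identical, 1 means a single
	   inserted, deleted or replaced character */
	LevHasan = complev('HASSAN', 'HASAN');
	LevEran = complev('IRAN', 'ERAN');
	/* compged and spedis use weighted costs, so their scales differ from complev */
	GedOsma = compged('OSAMA', 'OSMA');
	SpeOsma = spedis('OSMA', 'OSAMA');
	/* Assumed rule: treat a distance of exactly 1 as a partial match */
	IsPartial = (LevHasan = 1);
	put LevHasan= LevEran= GedOsma= SpeOsma= IsPartial=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Every keyword has to be compared with every word of the text, which is where the cost comes from with around a million keywords.&lt;/P&gt;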
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe one of the Dataflux products would do this if you have it licensed.&lt;/P&gt;
</description>
      <pubDate>Tue, 24 Dec 2019 18:45:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613776#M18585</guid>
      <dc:creator>JackHamilton</dc:creator>
      <dc:date>2019-12-24T18:45:22Z</dc:date>
    </item>
    <item>
      <title>Re: Finding Strings (Exact or Partial Match) in a Line</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613798#M18586</link>
      <description>&lt;P&gt;I agree completely with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13901"&gt;@JackHamilton&lt;/a&gt;;&amp;nbsp;you should look at either DataFlux or the SAS Text Analysis products.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, I had to do a simple version of this once, and it was easy to adapt my code for your data. Here's a test bench version of what you need that you can play around with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Tom&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data RawText;
	length TextStr $32767;
	input;
	TextStr = _infile_;
	LineNum = _n_;
	cards4;
SUPPLY OF MANPOWER AS PER PROFORMA INVOICE DATED 14.01.2019
1. SIGNED COMMERCIAL INVOICE(S) IN 1 ORIGINAL AND 2 COPIES
SHOWING DATE OF SUPPLY OF MANPOWER NOT LATER THAN 15.05.2019 AND
DULY COUNTERSIGNED BY APPLICANTS AUTHORIZED SIGNATORY AND TO BE
AUTHENTICATED BY IRAN NATIONAL BANK, ERAN TRADE FINANCE DEPARTMENT
PRIOR PRESENTATION OF DOCS FOR NEGOTIATION.
IN THE ABSENCE OF DATE OF SUPPLY OF MANPOWER, THE DATE SHOWN ON
COMMERCIAL INVOICE WILL BE CONSIDERED AS THE SUPPLY DATE for DAWOOD HASSAN.

USD 120/- OR EQUIVALENT IN THE L/C CURRENCY AND RELATED
CHARGES SHOULD BE DEDUCTED FROM THE PAYMENT FOR EACH PRESENTATION by DAWOOD HASAN
OF DISCREPANT DOCUMENTS UNDER THIS CREDIT, NOT WITHSTANDING ANY
INSTRUCTION TO THE CONTRARY, THIS CHARGE SHALL BE FOR THE ACCOUNT
OF BENEFICIARY

2. BENEFICIARYS A/C NO.: 202-577688-001-0010-000  BIC: PIBPBG2L
APPLICANT ACCOUNT. ALL OTHER
CHARGES INCLUDING REIMBURSEMENT AND
SWIFT PAYMENTS RELATED CHARGES ARE
FOR BENEFICIARY ACCOUNT in SYRIA
WITHOUT DESPATCH FULL SET OF PRESENTED / NEGOTIATED DOCUMENTS IN ONE LOT
BY COURIER TO: QATAR NATIONAL BANK, MAIN OFFICE, GRAND HAMAD
STREET, TRADE FINANCE DEPARTMENT, IMPORTS SECTION, P.O. BOX 1000,
DOHA, QATAR.
++UPON RECEIPT OF CREDIT COMPLYING DOCUMENTS OSMA BIN LADEN PAYMENT SHALL BE
EFFECTED BY US AS PER PRESENTING BANKS INSTRUCTION.
;;;;
run;

data CompText;
	length CompStr $50;
	input;
	CompStr = _infile_;
	CompStr = upcase(CompStr);
	cards;
PIBPBG2L
OSAMA
BIN
LADEN
DAWOOD
HASSAN
SYRIA
IRAN
;
run;

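/* Replace punctuation with blanks, collapse repeated blanks, and upper-case the text */
data RawTextProcess;
	set RawText;
	/* translate(source, to, from): each listed punctuation character becomes a blank */
	TextStr = translate(TextStr, "                                ", "`~!@#$%^&amp;amp;*()-=_+[]\{}|;':"",./&amp;lt;&amp;gt;?");
	TextStr = upcase(left(compbl(TextStr)));
run;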

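/* Split each cleaned line into one observation per word */
data DeString;
	length RawWord $25;	/* words longer than 25 characters are truncated */
	set RawTextProcess;
	drop TextStr;

	/* countw returns 0 for a blank line, so blank lines produce no rows */
	do WordNum = 1 to countw(TextStr);
		RawWord = scan(TextStr, WordNum);
		output;
	end;
run;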

/* Compare every keyword with every word; a result of 0 means the two strings are identical */
proc sql noprint;
	create table Compare as
		select c.CompStr, d.RawWord, d.LineNum, d.WordNum,
			compged(c.CompStr, d.RawWord) as CompGedResult,
			complev(c.CompStr, d.RawWord) as CompLevResult,
			spedis(c.CompStr, d.RawWord) as SpeDisResult
			from CompText c cross join DeString d
				order by CompLevResult;
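
	/* Sketch of a possible follow-up step, not part of the original post:
	   label each keyword/word pair from the Compare table built above.
	   The Matches table name and the cutoff of 1 edit are assumptions.
	   Short keywords such as BIN will also pick up unrelated words like IN,
	   so the cutoff needs tuning for real data. */
	create table Matches as
		select CompStr, RawWord, LineNum, WordNum,
			case when CompLevResult = 0 then "Exact Match" else "Partial Match" end as MatchType
			from Compare
				where CompLevResult le 1
					order by CompStr, LineNum, WordNum;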
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 24 Dec 2019 22:45:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/613798#M18586</guid>
      <dc:creator>TomKari</dc:creator>
      <dc:date>2019-12-24T22:45:03Z</dc:date>
    </item>
    <item>
      <title>Re: Finding Strings (Exact or Partial Match) in a Line</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/614515#M18613</link>
      <description>&lt;P&gt;To me, this looks like you are dealing with AML (Anti-Money Laundering) requirements that are specified in law by many governments. SAS has specific products for this, as do other vendors. If so, then providing more context to what you are doing would be helpful.&lt;/P&gt;</description>
      <pubDate>Tue, 31 Dec 2019 05:13:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Finding-Strings-Exact-or-Partial-Match-in-a-Line/m-p/614515#M18613</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2019-12-31T05:13:13Z</dc:date>
    </item>
  </channel>
</rss>

