<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Match in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752852#M237182</link>
    <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;. I was not aware of this function. I am definitely going to read, understand and test this for my learning and use it. Appreciate it.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 08 Jul 2021 12:04:15 GMT</pubDate>
    <dc:creator>Anuz</dc:creator>
    <dc:date>2021-07-08T12:04:15Z</dc:date>
    <item>
      <title>Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752340#M236962</link>
      <description>&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;Hi All,&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;Looking for best options to match values in two tables&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;I have one dataset, shown below, that holds company names:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;data COMPANY_DATA;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;input ENUM $ COMPANY_NAME $ ;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;datalines;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;D1234 COMPAX LTD&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;D1256 TEST NOTE LTD&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;D1345 LISTKK ENTERPRISES&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;D2234 ZIVOKA&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;D4534 LIBORD NUKA PVT&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;D7887 ZIMZUM COLLEGE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;run;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;I have two more datasets that hold possible first names and second names:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;data firstnames;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;input fname $;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;datalines;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;peter&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;sam&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;zivo&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;zimzumtu&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;run;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;data secondnames;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;input sname $;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;datalines;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;peterson&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;kane&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;bargh&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;anderson&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;run;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;The first names and second names dataset is fairly.&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;For each row in the COMPANY_DATA dataset, I want to highlight whether the company name contains a rough match, partial or complete, to a first and/or second name. For example, if there is a first name mark and the company name has Marke in it, that should be a possible match, but if the company name has Market Research then that is not a possible match. So basically the match should be as close as possible to the first name listed in the firstnames dataset.&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="1 2 3 4 5 6 7"&gt;What is the best option to achieve the above? Thank you in advance.&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jul 2021 15:24:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752340#M236962</guid>
      <dc:creator>Anuz</dc:creator>
      <dc:date>2021-07-06T15:24:28Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752358#M236975</link>
      <description>&lt;P&gt;Let me know if you need more clarity on my query or if I need to explain it better.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jul 2021 17:11:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752358#M236975</guid>
      <dc:creator>Anuz</dc:creator>
      <dc:date>2021-07-06T17:11:59Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752382#M236991</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/382230"&gt;@Anuz&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for supplying the data in data steps. For the benefit of all I've tried to make the code more readable and some of the steps needed tweaking (see further below).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please supply another data step with &lt;FONT face="courier new,courier"&gt;datalines&lt;/FONT&gt; showing the resulting data you want to see in the output, based on the input data you have given.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data COMPANY_DATA;
	/* Formatted input so that multi-word company names are read in full */
	input ENUM $5. COMPANY_NAME $25.;

	datalines;
D1234 COMPAX LTD
D1256 TEST NOTE LTD
D1345 LISTKK ENTERPRISES
D2234 ZIVOKA
D4534 LIBORD NUKA PVT
D7887 ZIMZUM COLLEGE
;


data firstnames;
	input fname $;

	datalines;
peter
sam
zivo
zimzumtu
;


data secondnames;
	input sname $;

	datalines;
peterson
kane
bargh
anderson
;

&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; kind regards,&lt;/P&gt;
&lt;P&gt;Amir.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jul 2021 18:31:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752382#M236991</guid>
      <dc:creator>Amir</dc:creator>
      <dc:date>2021-07-06T18:31:41Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752385#M236994</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/22588"&gt;@Amir&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Something like the output below would be good, as it will help the end user understand the analysis. However, I am open to any other suggestions for the output.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;this is on the assumption that in the firstnames dataset -&amp;nbsp; zivoka , alex and zimzum are listed as accepted first names.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data COMPANY_DATA;
	input ENUM $5. COMPANY_NAME $25. FNAME_FLAG $1. FNAME_POS 8.;

	datalines;
D1234 COMPAX LTD              N   0 
D1256 TEST NOTE LTD           N   0
D1345 LISTKK ENTERPRISES      N   0 
D2234 ZIVOKA                  Y   1
D4534 NUKA ALEX PVT           Y   5  
D7887 ZIMZUM COLLEGE          Y   1
;

run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 06 Jul 2021 18:58:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752385#M236994</guid>
      <dc:creator>Anuz</dc:creator>
      <dc:date>2021-07-06T18:58:24Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752436#M237026</link>
      <description>&lt;P&gt;You mention that:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;this is on the assumption that in the firstnames dataset -&amp;nbsp; zivoka , alex and zimzum are listed as accepted first names.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;OL&gt;
&lt;LI&gt;Does "accepted first names" mean the names on the firstnames data set? If not, can you please supply another data set with the accepted names or precisely define the rules of what is accepted.&lt;/LI&gt;
&lt;LI&gt;"alex" does not appear in the input data you supplied for the firstnames data set. Should it be in the firstnames input data set?&lt;/LI&gt;
&lt;LI&gt;The closest I can see to "zivoka" on the firstnames data set is "zivo". Does this mean the fname can be shorter than the name it matches with?&lt;/LI&gt;
&lt;LI&gt;The closest I can see to "zimzum" on the firstnames data set is "zimzumtu". Does this mean the fname can be longer than the name it matches with?&lt;/LI&gt;
&lt;LI&gt;What is the maximum length the names can be different by?&lt;/LI&gt;
&lt;LI&gt;Can the difference only be at the end of the name or can it be at the beginning or up to a specific number of characters anywhere in the fname?&lt;/LI&gt;
&lt;LI&gt;If the input data in the firstnames data set needs to be corrected then please edit your original question and make the correction.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; kind regards,&lt;/P&gt;
&lt;P&gt;Amir.&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jul 2021 21:26:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752436#M237026</guid>
      <dc:creator>Amir</dc:creator>
      <dc:date>2021-07-06T21:26:44Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752442#M237028</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data firstnames;
	input fname $;

	datalines;
peter
sam
zivoka
zimzum
alex
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Hi &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/22588"&gt;@Amir&lt;/a&gt;. Please take the above for the firstnames dataset.&lt;/P&gt;
&lt;P&gt;You are right, accepted first names means the names in the firstnames dataset.&lt;/P&gt;
&lt;P&gt;If a value in the firstnames dataset is found anywhere in a company name in the company dataset, the code needs to highlight/identify it. So, for example, if the company name is&lt;/P&gt;
&lt;P&gt;a. PoundAlex Ltd or&lt;BR /&gt;b. AlexPound Ltd or&lt;BR /&gt;c. Alex Ltd or&lt;BR /&gt;d. Boxing Alex Ltd&lt;/P&gt;
&lt;P&gt;then in all four cases Alex needs to be identified.&lt;/P&gt;
&lt;P&gt;Hope I have managed to clarify all the questions you raised. Thank you&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jul 2021 21:40:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752442#M237028</guid>
      <dc:creator>Anuz</dc:creator>
      <dc:date>2021-07-06T21:40:31Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752740#M237147</link>
      <description>&lt;P&gt;Due to lack of time, this is not a full solution and there are probably better ways, but I amended the input data to reflect the company names you showed in your output and added an extra row to secondnames to make sure the matching process worked, as the original data in secondnames had no matches.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Extra processing will be required to remove the extra rows that are generated by the &lt;FONT face="courier new,courier"&gt;proc sql&lt;/FONT&gt;, but this should give you some ideas on how your data might be processed:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data COMPANY_DATA;
	input ENUM $5. COMPANY_NAME $25.;

	datalines;
D1234 COMPAX LTD
D1256 TEST NOTE LTD
D1345 LISTKK ENTERPRISES
D2234 ZIVOKA
D4534 NUKA ALEX PVT
D7887 ZIMZUM COLLEGE
;


data firstnames;
	input fname $;

	datalines;
peter
sam
zivoka
zimzum
alex
;


data secondnames;
	input sname $;

	datalines;
peterson
kane
bargh
anderson
note
;


proc sql noprint;
	create table
		combined
	as
	select distinct
		 c.*
		,ifc(find(company_name,fname,'it'),'Y','N') as fname_flag
		,find(company_name,fname,'it')              as fname_pos
		,ifc(find(company_name,sname,'it'),'Y','N') as sname_flag
		,find(company_name,sname,'it')              as sname_pos
		
	from
		 company_data c
		,firstnames
		,secondnames
	;
quit;

&lt;/CODE&gt;&lt;/PRE&gt;
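&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a rough sketch of that extra processing (not a full solution), the rows in &lt;FONT face="courier new,courier"&gt;combined&lt;/FONT&gt; could be collapsed to one row per company. The table name &lt;FONT face="courier new,courier"&gt;company_flags&lt;/FONT&gt; is made up for illustration, and keeping the largest match position when several names match is an assumption about what the output should show:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: assumes the COMBINED table created by the proc sql step above */
proc sql;
	create table
		company_flags   /* hypothetical output table name */
	as
	select
		 enum
		,company_name
		,max(fname_flag) as fname_flag   /* 'Y' sorts after 'N', so any match yields 'Y' */
		,max(fname_pos)  as fname_pos    /* assumption: keep the largest match position; 0 means no match */
		,max(sname_flag) as sname_flag
		,max(sname_pos)  as sname_pos
	from
		combined
	group by
		 enum
		,company_name
	;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Grouping by &lt;FONT face="courier new,courier"&gt;enum&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;company_name&lt;/FONT&gt; leaves a single row for each company, with a flag set to Y if any name in the corresponding lookup data set was found.&lt;/P&gt;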
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; kind regards,&lt;/P&gt;
&lt;P&gt;Amir.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Jul 2021 22:25:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752740#M237147</guid>
      <dc:creator>Amir</dc:creator>
      <dc:date>2021-07-07T22:25:31Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752782#M237156</link>
      <description>&lt;P&gt;If you have the required SAS module licensed (and properly configured...) then have a look into the DQ functions like DQMATCH().&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/dqclref/p09nffezbjyj4on11oblz77aq1x6.htm" target="_blank"&gt;https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/dqclref/p09nffezbjyj4on11oblz77aq1x6.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;These DQ functions allow you, for example, to tokenize a string (= split it into its parts, such as company name and address components) or to create match codes, which you can then use to detect similar names that are just spelled differently.&lt;/P&gt;
&lt;P&gt;These functions use a knowledge base (a set of data and rules), so, for example, many company names and spelling variations are already predefined to result in the same match code. Match codes are like a cluster ID: whatever is in the same cluster has a high probability of being the same, even if spelled differently.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jul 2021 07:33:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752782#M237156</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2021-07-08T07:33:23Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752850#M237180</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/22588"&gt;@Amir&lt;/a&gt;&amp;nbsp;.&lt;/P&gt;
&lt;P&gt;Appreciate your time and patience. This works for me. I was able to use your method and adapt it to get an output that satisfies my requirement. Thank you again. Good day.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jul 2021 12:02:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752850#M237180</guid>
      <dc:creator>Anuz</dc:creator>
      <dc:date>2021-07-08T12:02:53Z</dc:date>
    </item>
    <item>
      <title>Re: Match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752852#M237182</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;. I was not aware of this function. I am definitely going to read, understand and test this for my learning and use it. Appreciate it.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jul 2021 12:04:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match/m-p/752852#M237182</guid>
      <dc:creator>Anuz</dc:creator>
      <dc:date>2021-07-08T12:04:15Z</dc:date>
    </item>
  </channel>
</rss>

