<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Match two datasets on a partial match in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893400#M352925</link>
    <description>&lt;P&gt;Matching addresses is complicated. In fact SAS has aproduct designed to do this&amp;nbsp; - SAS Data Quality. The question I have is how do you know the examples you provide are indeed the same addresses? If you are validating these addresses for postal purposes, then an apartment number at a street address is not the same as the actual street address itself. If all you are required to do is match the street address, that might be good enough.&lt;/P&gt;</description>
    <pubDate>Fri, 08 Sep 2023 22:54:23 GMT</pubDate>
    <dc:creator>SASKiwi</dc:creator>
    <dc:date>2023-09-08T22:54:23Z</dc:date>
    <item>
      <title>Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893399#M352924</link>
      <description>&lt;P&gt;Hello.&amp;nbsp; I'm not sure if this is even possible but thought I'd check.&amp;nbsp; I have 2 datasets.&amp;nbsp; They both have multiple variables, but the one that is similar between the two are address variables, and those are the variables I need to try and match.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This issue is, one set has values such as:&lt;/P&gt;&lt;P&gt;193 FAIRVIEW LN&lt;BR /&gt;1006 NUT TREE ROAD&lt;BR /&gt;6121 PASEO DEL NORTE&lt;BR /&gt;9401 PAINTER AVE&lt;BR /&gt;3744 LONG BEACH BLVD&lt;BR /&gt;11801 PIERCE ST FL&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;While the other has values similar to this:&lt;/P&gt;&lt;P&gt;193 FAIRVIEW LN STE 100&lt;BR /&gt;1006 NUT TREE ROAD APT 5&lt;BR /&gt;6121 PASEO DEL NORTE BLDNG 2&lt;BR /&gt;9401 PAINTER AVE SUITE 2A&lt;BR /&gt;3744 LONG BEACH BLVD SUITE 224&lt;BR /&gt;11801 PIERCE ST FL APT 315&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to create a format from one dataset and use that format on the other data set to create a new variable (say addr_match), where if theres a match, then addr_match = 1.&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a way to match these variables where if part of the addresses match then it'd be addr_match = 1, or do the values in the address fields need to be exactly the same?&amp;nbsp; There are thousands of rows in each dataset.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Sep 2023 22:39:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893399#M352924</guid>
      <dc:creator>JH74</dc:creator>
      <dc:date>2023-09-08T22:39:06Z</dc:date>
    </item>
    <item>
      <title>Re: Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893400#M352925</link>
      <description>&lt;P&gt;Matching addresses is complicated. In fact SAS has aproduct designed to do this&amp;nbsp; - SAS Data Quality. The question I have is how do you know the examples you provide are indeed the same addresses? If you are validating these addresses for postal purposes, then an apartment number at a street address is not the same as the actual street address itself. If all you are required to do is match the street address, that might be good enough.&lt;/P&gt;</description>
      <pubDate>Fri, 08 Sep 2023 22:54:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893400#M352925</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-09-08T22:54:23Z</dc:date>
    </item>
    <item>
      <title>Re: Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893402#M352926</link>
      <description>&lt;P&gt;Provide some example data in the form of data steps and perhaps you'll get some example code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you values are that "clean", not having to match "LN" to "LANE" for example it may be quite doable.&lt;/P&gt;
&lt;P&gt;Format per se may not be the best approach though. Or you would have to code a function and then create a format using that function and may be more work that some sort of Join with a scoring function such as Verify or COMGED&lt;/P&gt;</description>
      <pubDate>Fri, 08 Sep 2023 23:17:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893402#M352926</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-09-08T23:17:32Z</dc:date>
    </item>
    <item>
      <title>Re: Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893408#M352930</link>
      <description>&lt;P&gt;Your sample data indicates that the shorter string is exactly contained in the longer string. If that's representative for your real data then code along the line of below could work.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have1;
  infile datalines truncover;
  input addr_short $100.;
  datalines;
193 FAIRVIEW LN
1006 NUT TREE ROAD
6121 PASEO DEL NORTE
9401 PAINTER AVE
3744 LONG BEACH BLVD
11801 PIERCE ST FL 
;
data have2;
  infile datalines truncover;
  input addr_long $150.;
  datalines;
193 FAIRVIEW LN STE 100
1006 NUT TREE ROAD APT 5
6121 PASEO DEL NORTE BLDNG 2
9401 PAINTER AVE SUITE 2A
3744 LONG BEACH BLVD SUITE 224
11801 PIERCE ST FL APT 315
;

/* using SAS find() function */
proc sql;
  select h1.addr_short, h2.addr_long
  from have1 h1 inner join have2 h2
  on find(h2.addr_long,strip(h1.addr_short))&amp;gt;0
  ;
quit;

/* using SQL like operator */
proc sql;
  select h1.addr_short, h2.addr_long
  from have1 h1 inner join have2 h2
  on h2.addr_long like cats(h1.addr_short,'%')
  ;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1694222820971.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/87739iC90A9BEFA26A7A47/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_0-1694222820971.png" alt="Patrick_0-1694222820971.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 09 Sep 2023 01:42:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893408#M352930</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-09-09T01:42:39Z</dc:date>
    </item>
    <item>
      <title>Re: Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893631#M353032</link>
      <description>&lt;P&gt;Thank you for this.&amp;nbsp; I've ran this and I do get results showing matches.&amp;nbsp; I want to output all matching values into a new dataset (move values from the h1 dataset that match the h2 values).&amp;nbsp; Using the code you provided, is there a step I can add to output the results?&amp;nbsp; Or can I do this in a different way that can output a new dataset that contains the matches?&lt;/P&gt;</description>
      <pubDate>Mon, 11 Sep 2023 19:50:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893631#M353032</guid>
      <dc:creator>JH74</dc:creator>
      <dc:date>2023-09-11T19:50:14Z</dc:date>
    </item>
    <item>
      <title>Re: Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893634#M353034</link>
      <description>&lt;P&gt;To create a dataset of the matched results just add a CREATE TABLE to the query:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  create table Want as
  select h1.addr_short, h2.addr_long
  from have1 h1 inner join have2 h2
  on h2.addr_long like cats(h1.addr_short,'%')
  ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 11 Sep 2023 20:25:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893634#M353034</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-09-11T20:25:00Z</dc:date>
    </item>
    <item>
      <title>Re: Match two datasets on a partial match</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893650#M353041</link>
      <description>Shorter string in a longer string&lt;BR /&gt;---h2.addr_long like cats(h1.addr_short,'%')&lt;BR /&gt;Worth the price of admission right there.</description>
      <pubDate>Mon, 11 Sep 2023 21:17:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Match-two-datasets-on-a-partial-match/m-p/893650#M353041</guid>
      <dc:creator>HB</dc:creator>
      <dc:date>2023-09-11T21:17:46Z</dc:date>
    </item>
  </channel>
</rss>

