<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Question about Conditional Merging in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106896#M29785</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;ha. geez. yea, that worked. Thanks you both!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 08 Oct 2013 17:35:07 GMT</pubDate>
    <dc:creator>robcafaro</dc:creator>
    <dc:date>2013-10-08T17:35:07Z</dc:date>
    <item>
      <title>Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106891#M29780</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello! I'm having some issues conditionally merging two datasets. This is what I want to do:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have an hmda data set (&amp;gt; 2 million observations) and ahs data set (~60,000 observations). I want to merge by metropolitan statistical area, race, sex, and here's where the conditional part comes in:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The last variable I want to join by is when ahsincome = hmdaincome or ahsincome &amp;gt; hmdaincome. I can get up to this point with this code:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;proc sql;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;create table&lt;STRONG&gt; hmda_ahs&lt;/STRONG&gt; as&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;select L.*, R.*&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;from &lt;STRONG&gt;ahs&lt;/STRONG&gt; as L&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;left join &lt;STRONG&gt;clean_hmda&lt;/STRONG&gt; as R&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;on&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;STRONG&gt; L.smsa&lt;/STRONG&gt;=&lt;STRONG&gt;R.smsa&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and&lt;STRONG&gt; L.race&lt;/STRONG&gt;=&lt;STRONG&gt;R.race&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and &lt;STRONG&gt;L.sex&lt;/STRONG&gt;=&lt;STRONG&gt;R.sex&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and (&lt;STRONG&gt;L.ahsincome&lt;/STRONG&gt; = &lt;STRONG&gt;R.hmdaincome&lt;/STRONG&gt; or &lt;STRONG&gt;L.ahsincome&lt;/STRONG&gt; &amp;gt; &lt;STRONG&gt;R.hmdaincome&lt;/STRONG&gt;);&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Now, obviously, this leaves me with a merged data set (hmda_ahs) that's ridiculously large. What I really want to end up with is a merged data set where &lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;(L.ahsincome&lt;/STRONG&gt; = &lt;STRONG&gt;R.hmdaincome&lt;/STRONG&gt; or &lt;STRONG&gt;L.ahsincome&lt;/STRONG&gt; &amp;gt; &lt;STRONG&gt;R.hmdaincome&lt;/STRONG&gt;) is true, but only retains the minimum difference between ahsincome and hmdaincome across the matches, so it cuts down on all of the duplicate matches. Is this a possible condition to add to the proc sql (I've tried, but can't seem to put that condition in), can I do it with a merge statement, or where else should I be looking? Or, is it possible to keep only the observations that meet my criterion in the humongous merged data set hmda_ahs?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I look forward to some ideas. Thanks!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Rob&lt;/P&gt;&lt;P&gt;&lt;A class="jive-link-email-small" href="mailto:rcafaro2@uncc.edu"&gt;rcafaro2@uncc.edu&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 08 Oct 2013 16:28:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106891#M29780</guid>
      <dc:creator>robcafaro</dc:creator>
      <dc:date>2013-10-08T16:28:05Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106892#M29781</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Untested at all....but I think your last grouped condition is also equal to &amp;gt;= rather than separate conditions with an or. You may need to break that up or put it in a having clause.&lt;/P&gt;&lt;P&gt;An easy alternative is order by the variables and then taking the first one in a datastep. &lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;proc sql;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;create table&lt;STRONG&gt; hmda_ahs&lt;/STRONG&gt; as&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;select L.*, R.*, min(ahsincome-hmdaincome) as min_income&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;from &lt;STRONG&gt;ahs&lt;/STRONG&gt; as L&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;left join &lt;STRONG&gt;clean_hmda&lt;/STRONG&gt; as R&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;on&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;STRONG&gt; L.smsa&lt;/STRONG&gt;=&lt;STRONG&gt;R.smsa&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and&lt;STRONG&gt; L.race&lt;/STRONG&gt;=&lt;STRONG&gt;R.race&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and &lt;STRONG&gt;L.sex&lt;/STRONG&gt;=&lt;STRONG&gt;R.sex&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and &lt;STRONG&gt;L.ahsincome&lt;/STRONG&gt; &amp;gt;= &lt;STRONG&gt;R.hmdaincome&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;group by smsa, race, sex&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;where (ahsincome-hmdaincome)=calculated min_income&amp;nbsp; ;&lt;/P&gt;&lt;P style="padding-left: 360px;"&gt;quit;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 08 Oct 2013 16:41:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106892#M29781</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2013-10-08T16:41:39Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106893#M29782</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;A small tweak ... when using GROUP BY and subsetting the RESULTS rather than subsetting the incoming data, use HAVING instead of WHERE.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 08 Oct 2013 17:07:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106893#M29782</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2013-10-08T17:07:26Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106894#M29783</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I just tried that code. It makes sense, but I'm getting an ambiguous reference error message (column smsa, race, and sex are in more than one table). How do I adjust for that?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 08 Oct 2013 17:25:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106894#M29783</guid>
      <dc:creator>robcafaro</dc:creator>
      <dc:date>2013-10-08T17:25:19Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106895#M29784</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Refer to the them with the appropriate alias, eg.&amp;nbsp; L.smsa &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 08 Oct 2013 17:30:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106895#M29784</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2013-10-08T17:30:08Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106896#M29785</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;ha. geez. yea, that worked. Thanks you both!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 08 Oct 2013 17:35:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106896#M29785</guid>
      <dc:creator>robcafaro</dc:creator>
      <dc:date>2013-10-08T17:35:07Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106897#M29786</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Ok, so I just tried to replicate this code with an updated dataset, but for some reason the results aren't right (as in, the final number of observations is way more than it should be). So, this time, I'm merging HMDA data (1,906,993 observations) with AHS data (22,511 observations). This is the code I am using:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Proc sql;&lt;/P&gt;&lt;P&gt;create table AHS_HMDA as&lt;/P&gt;&lt;P&gt;select L.* R.*, min(hmdaincome-ahsincome) as min_income&lt;/P&gt;&lt;P&gt;from ahs as L&lt;/P&gt;&lt;P&gt;left join hmda as R on&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; L.smsa=R.smsa&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and L.race=R.race&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and L.sex=R.sex&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and L.ahsincome &amp;lt;= R.hmdaincome&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; group by L.smsa, L.race, L.sex&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; having (hmdaincome-ahsincome) = calculated min_income;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This process is now resulting in over 100,000 observations, when there should not be any more observations than the AHS dataset (22,511). I'm basically trying to find matches between the two datasets, and only keeping the minimized difference between the income values of both datasets.... Last time I ran the code we had discussed before, I was left with a lot less observations. Not sure why it's different.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 29 Oct 2013 18:28:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106897#M29786</guid>
      <dc:creator>robcafaro</dc:creator>
      <dc:date>2013-10-29T18:28:29Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106898#M29787</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Do you have distinct SMSA in each file?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 29 Oct 2013 18:48:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106898#M29787</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2013-10-29T18:48:19Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106899#M29788</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It's a many to many merging process, so similar smsa's are found across both datasets, but multiple matches are made. There is no unique identifier in either data set. One other way I'm seriously considering to merge the data sets is to create bounds around AHS income (say, keep the minimum observation of income in which hmdaincome falls within a +/-$2000 range around ahsincome). This would lead to more observations, but not more than what I have within the ahs dataset itself. Any help on that would be much appreciated as well. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 29 Oct 2013 23:10:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106899#M29788</guid>
      <dc:creator>robcafaro</dc:creator>
      <dc:date>2013-10-29T23:10:26Z</dc:date>
    </item>
    <item>
      <title>Re: Question about Conditional Merging</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106900#M29789</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;More specifically, what I would like to do is &lt;SPAN style="font-family: Tahoma, Geneva, sans-serif; font-size: small;"&gt;place a bandwidth restriction around ahsincome that hmdaincome may fall into, then keep the kernel-weighted average within that bandwidth (say, +/- $2000 around ahsincome). This should bring my number of observations down significantly, but still yield (from previous research) close to 1000 observations. Any pointers?&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 30 Oct 2013 00:33:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Question-about-Conditional-Merging/m-p/106900#M29789</guid>
      <dc:creator>robcafaro</dc:creator>
      <dc:date>2013-10-30T00:33:42Z</dc:date>
    </item>
  </channel>
</rss>

