<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Data matching- only one firm is chosen to be matched against the original dataset in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144754#M38438</link>
    <description>A sample of 741 firms must each be matched to one firm from a dataset of 8330 firms on financial year (fyear), industry (ffind), and closest size, without selecting the same firm more than once.</description>
    <pubDate>Tue, 28 Oct 2014 07:44:28 GMT</pubDate>
    <dc:creator>mei</dc:creator>
    <dc:date>2014-10-28T07:44:28Z</dc:date>
    <item>
      <title>Data matching- only one firm is chosen to be matched against the original dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144754#M38438</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have two datasets. One is a small sample of 741 firms, and I need to find a match for each of those firms from a dataset of 8330 firms. The criteria are the same financial year (fyear) and the same industry (ffind), and among those candidates I take the firm closest in size.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have used the following SAS code to select the matched pairs. I have now found that the same firm can be selected several times as the match for different firms in the original dataset of 741 firms. How do I prevent a firm from being selected more than once?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sort data=huang_741a;&lt;/P&gt;&lt;P&gt;by gvkey fyear;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sort data=huang_bal_04a;&lt;/P&gt;&lt;P&gt;by gvkey fyear;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/* Pair each sample firm-year with every non-sample firm in the same year and industry */&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table control_741 as&lt;/P&gt;&lt;P&gt;select O.*, A.gvkey as Anum, A.fyear as Afy, abs(O.size-A.size) as sizeDiff&lt;/P&gt;&lt;P&gt;from huang_bal_04a as O inner join huang_741a as A&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on O.fyear=A.fyear and O.ffind=A.ffind&lt;/P&gt;&lt;P&gt;where O.gvkey not in (select gvkey from huang_741a)&lt;/P&gt;&lt;P&gt;order by gvkey;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;* 27001 rows;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sort data=control_741;&lt;/P&gt;&lt;P&gt;by Anum Afy;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/* For each sample firm-year, keep the candidate with the smallest size difference */&lt;/P&gt;&lt;P&gt;proc means data=control_741 noprint;&lt;/P&gt;&lt;P&gt;by Anum Afy;&lt;/P&gt;&lt;P&gt;output out=selected_741 idgroup(min(sizeDiff) out[1] (fyear gvkey)=);&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;* 741 rows;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/* Attach the full record of each selected control firm */&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table selected_741A as&lt;/P&gt;&lt;P&gt;select a.*, b.*&lt;/P&gt;&lt;P&gt;from selected_741 a left join huang_bal_04a b&lt;/P&gt;&lt;P&gt;on a.gvkey=b.gvkey and a.fyear=b.fyear&lt;/P&gt;&lt;P&gt;order by a.gvkey, a.fyear;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;* 597 rows;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/* Stack the sample firms and the selected controls */&lt;/P&gt;&lt;P&gt;data combined_741;&lt;/P&gt;&lt;P&gt;set huang_741a selected_741A;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;* 741+741=1282;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Oct 2014 07:44:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144754#M38438</guid>
      <dc:creator>mei</dc:creator>
      <dc:date>2014-10-28T07:44:28Z</dc:date>
    </item>
    <item>
      <title>Re: Data matching- only one firm is chosen to be matched against the original dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144755#M38439</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Can you provide some sample data and use it to explain what you need?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Oct 2014 09:23:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144755#M38439</guid>
      <dc:creator>LearnByMistk</dc:creator>
      <dc:date>2014-10-28T09:23:21Z</dc:date>
    </item>
    <item>
      <title>Re: Data matching- only one firm is chosen to be matched against the original dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144756#M38440</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To answer your question about how to avoid the same firm being selected again: look at your data, work out why more than one record matches under your join criteria, and then choose a method to reduce the matches, for example using distinct, the first date, or some other rule. The first stage is to go through each step and identify where the multiple records come from. It would also help to provide example data and to narrow down which procedure is causing the problem.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Oct 2014 09:52:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144756#M38440</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2014-10-28T09:52:29Z</dc:date>
    </item>
    <item>
      <title>Re: Data matching- only one firm is chosen to be matched against the original dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144757#M38441</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have attached two files: huang_741 and huang_bal_04a.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have now created five candidate matches for each firm using the following code:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc means data=control_total741 noprint;&lt;/P&gt;&lt;P&gt;by Anum Afy;&lt;/P&gt;&lt;P&gt;output out=selected_total741_5set idgroup(min(sizeDiff) out[5] (fyear gvkey)=);&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;* 741 rows;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have attached the file: selected_741_5sets.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I think this file is a good place to look. If one firm is selected three times, the same gvkey_1 value will appear three times, so I should choose gvkey_2 and gvkey_3 for the second and third duplicates. Instead of going through the file manually (there are 71 duplicates), would it be a good idea to write some code that makes the program choose correctly?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your time.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Oct 2014 10:07:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144757#M38441</guid>
      <dc:creator>mei</dc:creator>
      <dc:date>2014-10-28T10:07:27Z</dc:date>
    </item>
    <item>
      <title>Re: Data matching- only one firm is chosen to be matched against the original dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144758#M38442</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It would be better to post a little sample data and the output you need. Explain it in as few words as you can.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Oct 2014 11:31:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/144758#M38442</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2014-10-28T11:31:56Z</dc:date>
    </item>
    <item>
      <title>Re: Data matching- only one firm is chosen to be matched against the original dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/231183#M54560</link>
      <description>&lt;P&gt;It would be helpful if you also attached the huang_bal_04a dataset.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Oct 2015 14:57:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Data-matching-only-one-firm-is-chosen-to-be-matched-against-the/m-p/231183#M54560</guid>
      <dc:creator>drsurf</dc:creator>
      <dc:date>2015-10-22T14:57:07Z</dc:date>
    </item>
  </channel>
</rss>

