<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Merge two data sets based on more than one variable in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-data-sets-based-on-more-than-one-variable/m-p/816650#M322359</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have two data sets that I need to merge, Set1 and Set2. The matching variables that can be used are MRN, Full_Name, DOB and specimen_Date. The issue is that Set1 is missing about one third of its MRNs (complete in Set2), and Set2 is missing more than one third of its Full_Name values (complete in Set1).&lt;/P&gt;&lt;P&gt;More than one person (obs) may have the same DOB and/or specimen_Date within each data set. So I'm leaning toward using MRN and Full_Name, but I'm not sure how to address the issue of missing data!&lt;/P&gt;&lt;P&gt;I'm wondering if this works:&lt;/P&gt;&lt;P&gt;data merged;&lt;/P&gt;&lt;P&gt;merge&amp;nbsp; set1&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; set2&amp;nbsp; (in = in2);&lt;/P&gt;&lt;P&gt;by mrn full_name;&lt;/P&gt;&lt;P&gt;if in2;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
    <pubDate>Mon, 06 Jun 2022 14:48:44 GMT</pubDate>
    <dc:creator>mayasak</dc:creator>
    <dc:date>2022-06-06T14:48:44Z</dc:date>
    <item>
      <title>Merge two data sets based on more than one variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-data-sets-based-on-more-than-one-variable/m-p/816650#M322359</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have two data sets that I need to merge, Set1 and Set2. The matching variables that can be used are MRN, Full_Name, DOB and specimen_Date. The issue is that Set1 is missing about one third of its MRNs (complete in Set2), and Set2 is missing more than one third of its Full_Name values (complete in Set1).&lt;/P&gt;&lt;P&gt;More than one person (obs) may have the same DOB and/or specimen_Date within each data set. So I'm leaning toward using MRN and Full_Name, but I'm not sure how to address the issue of missing data!&lt;/P&gt;&lt;P&gt;I'm wondering if this works:&lt;/P&gt;&lt;P&gt;data merged;&lt;/P&gt;&lt;P&gt;merge&amp;nbsp; set1&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; set2&amp;nbsp; (in = in2);&lt;/P&gt;&lt;P&gt;by mrn full_name;&lt;/P&gt;&lt;P&gt;if in2;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Mon, 06 Jun 2022 14:48:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-data-sets-based-on-more-than-one-variable/m-p/816650#M322359</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-06-06T14:48:44Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two data sets based on more than one variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-data-sets-based-on-more-than-one-variable/m-p/816657#M322364</link>
      <description>&lt;P&gt;First suggestion: Try it and see the result.&lt;/P&gt;
&lt;P&gt;Second suggestion: make the sets small enough that you can check the results fairly quickly. If you have 1,000s of observations in both you might miss the needed behaviors.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The rules for how missing data are treated are best demonstrated by creating limited data sets, with missing in some of the NON-by variables and seeing the result. Which data is missing?&lt;/P&gt;
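&lt;P&gt;A minimal sketch of such a test, assuming character MRN and Full_Name and a numeric DOB. Only the variable names come from the question; every value below is made up for illustration. Set1 has one missing MRN, Set2 has one missing Full_Name, and Set2 also has one missing DOB (a non-BY variable) on a record that does match.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy data: names follow the question, values are invented */
data set1;
  infile datalines dsd truncover;
  input mrn :$8. full_name :$20. dob :yymmdd10.;
  format dob yymmdd10.;
  datalines;
1001,Ann Lee,1980-01-15
,Bob Ray,1975-06-30
1003,Cy Doe,1990-12-01
;

data set2;
  infile datalines dsd truncover;
  input mrn :$8. full_name :$20. dob :yymmdd10.;
  format dob yymmdd10.;
  datalines;
1001,Ann Lee,
1002,,1975-06-30
1003,Cy Doe,1990-12-01
;

/* MERGE with BY needs both inputs sorted by the BY variables */
proc sort data=set1; by mrn full_name; run;
proc sort data=set2; by mrn full_name; run;

data merged;
  merge set1 (in=in1) set2 (in=in2);
  by mrn full_name;
  from1 = in1;  /* 1 if set1 contributed to this observation */
  from2 = in2;  /* 1 if set2 contributed to this observation */
run;

proc print data=merged; run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Check two things in the printed output. A missing BY value is treated as just another value, so the Bob Ray record (missing MRN) and the 1002 record (missing Full_Name) should come out as separate observations rather than as one matched record. For the Ann Lee record, which matches on both BY variables, DOB should end up missing, because in a MERGE the value read from the later data set replaces the earlier one even when it is missing.&lt;/P&gt;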
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;DOB and things like "specimen_date" are in general crappy matching variables unless there are other identification variables. I would say specimen_date is likely not very useful for matching, as it is extremely likely that the same person has multiple specimen_date values, unless the specific purpose of the match is "test result" to specimen. I know that when I was going through some serious medical issues, "specimen_date" for some measures occurred as often as 4 times per day, and I don't want to imagine what attempting to match that data with other things would result in.&lt;/P&gt;</description>
      <pubDate>Mon, 06 Jun 2022 15:34:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-data-sets-based-on-more-than-one-variable/m-p/816657#M322364</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-06-06T15:34:10Z</dc:date>
    </item>
  </channel>
</rss>

