<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Compare observations in multiple datasets in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Compare-observations-in-multiple-datasets/m-p/316821#M9663</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have several datasets I need to stack. In some of them the field is named Gender and contains 'M' or 'F'; in others the field is named Gender_DESC and contains 'Male' or 'Female'. I need the field name and values to be consistent before stacking. Is there an easy way to compare the datasets and see which values each field contains, so I know what needs to be fixed?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let me know if you have any questions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Mon, 05 Dec 2016 19:46:29 GMT</pubDate>
    <dc:creator>UIC_SUG</dc:creator>
    <dc:date>2016-12-05T19:46:29Z</dc:date>
    <item>
      <title>Compare observations in multiple datasets</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Compare-observations-in-multiple-datasets/m-p/316821#M9663</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have several datasets I need to stack. In some of them the field is named Gender and contains 'M' or 'F'; in others the field is named Gender_DESC and contains 'Male' or 'Female'. I need the field name and values to be consistent before stacking. Is there an easy way to compare the datasets and see which values each field contains, so I know what needs to be fixed?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let me know if you have any questions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Mon, 05 Dec 2016 19:46:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Compare-observations-in-multiple-datasets/m-p/316821#M9663</guid>
      <dc:creator>UIC_SUG</dc:creator>
      <dc:date>2016-12-05T19:46:29Z</dc:date>
    </item>
    <item>
      <title>Re: Compare observations in multiple datasets</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Compare-observations-in-multiple-datasets/m-p/316842#M9664</link>
      <description>&lt;P&gt;One possibility would be to take every data set that contains GENDER_DESC and create GENDER.&amp;nbsp; That's relatively simple:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;length gender $ 1;&lt;/P&gt;
&lt;P&gt;gender = gender_desc;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then you could use the same process you use now to see if the GENDER values are equal.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
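&lt;P&gt;For context, those two statements belong inside a DATA step. Here is a minimal sketch, assuming a hypothetical input data set HAVE that contains GENDER_DESC and does not already contain GENDER; the names HAVE and WANT are placeholders, and the PROC FREQ step is only one possible way to list the resulting values:&lt;/P&gt;

```sas
/* HAVE and WANT are hypothetical data set names used for illustration */
data want;
   set have;                /* assumed to contain GENDER_DESC but not GENDER */
   length gender $ 1;       /* define GENDER as a one-character variable */
   gender = gender_desc;    /* only the first character fits, so 'Male' becomes 'M' */
run;

/* List each distinct GENDER value with its count, including missing values */
proc freq data=want;
   tables gender / missing;
run;
```

&lt;P&gt;GENDER_DESC is kept in WANT here so the original text stays available for later checks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;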
&lt;P&gt;There is a weakness to this approach.&amp;nbsp; By copying just the first character into GENDER, you may create the impression that all is OK when it is not.&amp;nbsp; For example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;length gender $ 1;&lt;/P&gt;
&lt;P&gt;gender_desc = 'Molly';&lt;/P&gt;
&lt;P&gt;gender = gender_desc;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;GENDER will be "M", but there is no way to tell that GENDER_DESC holds a bad value without examining it in more detail.&amp;nbsp; Those sorts of checks are possible but are more complex to set up, so the right solution may depend on how comfortable you are with more complex code.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Dec 2016 20:41:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Compare-observations-in-multiple-datasets/m-p/316842#M9664</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2016-12-05T20:41:43Z</dc:date>
    </item>
  </channel>
</rss>

