<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Proc Compare Large dataset with many columns in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577410#M13199</link>
    <description>&lt;P&gt;Yes, I'm outputting it to a data set. The problem is that, with over 400,000 records and 350+ columns, I don't know the best way to find the differences. How do I easily identify, by the primary key, which columns have differences without scrolling through 400,000 rows and 350 columns?&lt;/P&gt;</description>
    <pubDate>Mon, 29 Jul 2019 15:38:21 GMT</pubDate>
    <dc:creator>Binxie</dc:creator>
    <dc:date>2019-07-29T15:38:21Z</dc:date>
    <item>
      <title>Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577364#M13194</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have two extra-large data sets (&amp;gt;9 million records) that I am trying to compare. The data sets have over 300 columns in them.&lt;/P&gt;&lt;P&gt;I was able to run a PROC COMPARE that outputs only the differences by ID. However, because of the size of the data sets and the restrictions on exporting at my company, I cannot get the data out to analyze, record by record, which columns are not matching.&lt;/P&gt;&lt;P&gt;Is there an easy way in SAS to show only the variables (by the ID key) that are different? Any other suggestions on how I can view only the differences in the individual columns?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jul 2019 14:21:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577364#M13194</guid>
      <dc:creator>Binxie</dc:creator>
      <dc:date>2019-07-29T14:21:30Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577394#M13196</link>
      <description>Why can't you get the output? Have you tried suppressing the printed output and generating a data set to view instead?</description>
      <pubDate>Mon, 29 Jul 2019 15:09:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577394#M13196</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-29T15:09:02Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577406#M13197</link>
      <description>&lt;P&gt;I can view the output fine. I just need the ability to identify the differences easily. Currently there are over 400,000 rows, and any of the 350+ columns could contain differences. I'm just trying to identify, by the ID, which records have differences and in which columns.&lt;/P&gt;&lt;P&gt;If I could export, it would be easier to analyze, but I can't, so I'm trying to figure out a way to output, by ID, just the columns where the dif=XXX.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jul 2019 15:27:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577406#M13197</guid>
      <dc:creator>Binxie</dc:creator>
      <dc:date>2019-07-29T15:27:32Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577408#M13198</link>
      <description>Are you outputting your results to a data set using the OUT option? If not, that's likely what you want to do, since you can then filter that as desired. The OUT options are on the PROC COMPARE statement. If you don't know how, please show your current code and we can show you how to modify it.</description>
      <pubDate>Mon, 29 Jul 2019 15:31:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577408#M13198</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-29T15:31:35Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577410#M13199</link>
      <description>&lt;P&gt;Yes, I'm outputting it to a data set. The problem is that, with over 400,000 records and 350+ columns, I don't know the best way to find the differences. How do I easily identify, by the primary key, which columns have differences without scrolling through 400,000 rows and 350 columns?&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jul 2019 15:38:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577410#M13199</guid>
      <dc:creator>Binxie</dc:creator>
      <dc:date>2019-07-29T15:38:21Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577412#M13201</link>
      <description>Have you tried using the OUTDIF or OUTNOEQUAL options to limit the output to records with differences?</description>
      <pubDate>Mon, 29 Jul 2019 15:40:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577412#M13201</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-29T15:40:42Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577413#M13202</link>
      <description>Did you use the METHOD/FUZZ options so that minute differences are suppressed?</description>
      <pubDate>Mon, 29 Jul 2019 15:41:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577413#M13202</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-29T15:41:11Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577414#M13203</link>
      <description>&lt;P&gt;These are actual differences. The two data sets need to be identical. It's a new view that was created from an old one, and we are testing that the landed data matches what the business expects.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jul 2019 15:45:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577414#M13203</guid>
      <dc:creator>Binxie</dc:creator>
      <dc:date>2019-07-29T15:45:18Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577416#M13204</link>
      <description>&lt;P&gt;Yes, I have OUTDIF and OUTNOEQUAL in my PROC COMPARE step. There are actually 400,000 differences in the data.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;proc compare base=recs_a compare=recs_b
    out=result outnoequal outbase outcomp outdif
    noprint;
  id POLICY_NUMBER TERM_NUMBER POLICY_RISK_IDENTIFIER EFFECTIVE_FROM EFFECTIVE_TO;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 29 Jul 2019 15:56:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577416#M13204</guid>
      <dc:creator>Binxie</dc:creator>
      <dc:date>2019-07-29T15:56:55Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Compare Large dataset with many columns</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577417#M13205</link>
      <description>Try using just OUTNOEQUAL; I think OUTNOEQUAL and OUTDIF are the opposite of each other.</description>
      <pubDate>Mon, 29 Jul 2019 15:58:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Proc-Compare-Large-dataset-with-many-columns/m-p/577417#M13205</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-29T15:58:03Z</dc:date>
    </item>
  </channel>
</rss>

