<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Comparison of 2 tables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575435#M162769</link>
    <description>PROC COMPARE - test it on subsets first to ensure you can get the output you need.</description>
    <pubDate>Mon, 22 Jul 2019 16:08:28 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2019-07-22T16:08:28Z</dc:date>
    <item>
      <title>Comparison of 2 tables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575420#M162761</link>
      <description>&lt;P&gt;Hello, I have the following issue: I have 2 tables that contain just the same data, but calculated by different algorithms (i.e. 2 persons are watching for cars: identity no, colour, speed) having one key (identity no from the example) and i have to find the lines which have any differences in observations.&lt;BR /&gt;&lt;BR /&gt;The issue's that the tables are 300 colomns x 1m lines.&lt;BR /&gt;&lt;BR /&gt;The straightforward algorithm is:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/*Preparing sample tables*/
data person1;
input identity, colour $50., speed;
datalines;
1 red 10
2 blue 15
3 green 30
1 red 15&lt;BR /&gt;;&lt;BR /&gt;run;

data person2;
input identity, colour $50., speed;
datalines;
1 red 10
2 white 15
3 green 2
1 red 15&lt;BR /&gt;;&lt;BR /&gt;run;

/*I suppose algorithm to show observervations with identies 2 and 3 are mismatching (2 - colour, 3 - speed).*/

proc sql;
create table join as
select t1.*, t2.colour as colour_2, t2.speed as speed_2 from
(select * from person1) t1
full join
(select * from person2) t2
on t1.identity = t2.identity;
quit;

/*Data step is used to add reason field*/
data mismatch;
format reason $50.;
set join;
if colour ne colour_2 then do; reson ='colour'; output; end;
if speed ne speed_2 then do; reson ='speed'; output; end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The issue is - I have 300 columns, so "t2.column as column_2" section in proc sql and "if column ne column_2" in data step are quite hude - is there any more simple solution?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;And i have to full join two 300x1m tables, is there any way-around solution?&lt;/P&gt;</description>
      <pubDate>Mon, 22 Jul 2019 15:58:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575420#M162761</guid>
      <dc:creator>i_Van</dc:creator>
      <dc:date>2019-07-22T15:58:30Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of 2 tables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575435#M162769</link>
      <description>PROC COMPARE - test it on subsets first to ensure you can get the output you need.</description>
      <pubDate>Mon, 22 Jul 2019 16:08:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575435#M162769</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-22T16:08:28Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of 2 tables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575720#M162912</link>
      <description>&lt;P&gt;Thank you greatly, but PROC COMPARE is not the very thing I need (or more likely I'm not good at reading help):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc compare base=person1 compare=person2 out=compare; 
id identity; 
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;would give me the table containing not only differing values but both equal values, and having 300 columns it would be hard to find the differing "XXXXX" for chars and differing value for numeric.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="2019-07-23 13_03_43-SAS Enterprise Guide.png" style="width: 492px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/31219iA71C4D0BEC7C5267/image-size/large?v=v2&amp;amp;px=999" role="button" title="2019-07-23 13_03_43-SAS Enterprise Guide.png" alt="2019-07-23 13_03_43-SAS Enterprise Guide.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;More over the values that present only in one table would not be indicated in the table - only the overall mismatching amount in output like&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Number of Observations in Common: 4.
Number of Observations in WORK.B but not in WORK.A: 1.
Total Number of Observations Read from WORK.A: 4.
Total Number of Observations Read from WORK.B: 5. &lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;while proc sql&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data mismatch;
format reason $50.;
set join;
reason = '';
if colour ne colour_2 or speed ne speed_2;
if colour ne colour_2 then  reason = cat('colour',' ', reason); 
if speed ne speed_2 then reason =cat('speed',' ', reason); 
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;would filter out only mismatching lines, highlighting mismatching variables and showing lines that exist only in one table:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="2019-07-23 13_04_07-SAS Enterprise Guide.png" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/31220i528CA10CF1309EEB/image-size/large?v=v2&amp;amp;px=999" role="button" title="2019-07-23 13_04_07-SAS Enterprise Guide.png" alt="2019-07-23 13_04_07-SAS Enterprise Guide.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2019 10:07:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575720#M162912</guid>
      <dc:creator>i_Van</dc:creator>
      <dc:date>2019-07-23T10:07:32Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of 2 tables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575829#M162935</link>
      <description>I thought it did show where lines do differ? Did you check the options available to control the output on the PROC COMPARE statement?&lt;BR /&gt;&lt;BR /&gt;It's most definitely not perfect, but I think in your case, it does do what you asked.</description>
      <pubDate>Tue, 23 Jul 2019 15:00:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Comparison-of-2-tables/m-p/575829#M162935</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-23T15:00:40Z</dc:date>
    </item>
  </channel>
</rss>

