<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Option for truncating &amp;quot;long&amp;quot; decimal numbers. in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692091#M210757</link>
    <description>&lt;P&gt;Hi again,&lt;/P&gt;
&lt;P&gt;I guess I'll go next way - if simple minus/except doesn't match - will do except for all character and numeric columns without decimal part, if that part of tables match - that means issue is in that decimal columns, at least something:)&lt;/P&gt;
&lt;P&gt;Set of columns will be possible extract using proc content/macro variables manipulations etc.:)&lt;/P&gt;
&lt;P&gt;Alternative way just instead of select * - do select &amp;amp;column_LIST&amp;nbsp; and form &amp;amp;column_LIST based on data type - if decimal - round, if no -leave as it is, but offcourse that will impact on perfopmance&lt;/P&gt;</description>
    <pubDate>Fri, 16 Oct 2020 10:22:45 GMT</pubDate>
    <dc:creator>Yura2301</dc:creator>
    <dc:date>2020-10-16T10:22:45Z</dc:date>
    <item>
      <title>Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691886#M210661</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;
&lt;P&gt;Didn't work for a long time with SAS, so would like to ask for advice regarding a tricky case.&lt;/P&gt;
&lt;P&gt;So let's assume table A was migrated from SAS data set to some external DB, 1 to 1 migration.&lt;/P&gt;
&lt;P&gt;After the migration, we'll need to compare if both tables are still the same, in the end should be simple answer "Yes" or "No", no need in detail report, for now.&lt;/P&gt;
&lt;P&gt;Obviously, for such a case no sense in proc compare or like that, enough simple code like&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;select count(*)&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;from(&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; select * from saslib.tableX&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; except&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; select * from DBLib.tableX&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;) as a&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Plus reversal check like DBLib.tableX minus saslib.tableX.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Such way works fine for the majority of the tables, a few tables had issues with encoding - expected, but other and main issue - is decimals like "0,000000123".&lt;/P&gt;
&lt;P&gt;So obviously each DB works with decimals in its own way, but what I would like to do(if possible) - set some global option at the beginning of SAS session that will say like:&lt;/P&gt;
&lt;P&gt;Ignore very small decimals&amp;nbsp; - like if we have 0,0019 and 0,002 -&amp;nbsp; treat them as the same values, so except(minus) of such values will result in 0 rows.&lt;/P&gt;
&lt;P&gt;Of course, we can do rounding for needed columns, etc., but generally - such comparison needs to be done for hundreds of tables(each has its own set of columns etc.), some of them are huge, so would like to rid - of any data manipulation before except(minus of those 2 tables).&lt;/P&gt;
&lt;P&gt;Making HASH based on all columns looks like also is not an option for the case - anyway, those problematic decimals will make hash value different.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So long story short - would like to ask if there is some elegant way to force SAS to treat 0.0019 and 0,002 as the same values without any additional coding.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I Will be very thankful for any advice:)!&lt;/P&gt;</description>
      <pubDate>Thu, 15 Oct 2020 16:41:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691886#M210661</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-15T16:41:40Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691902#M210665</link>
      <description>&lt;P&gt;You may want to investigate the Compare procedure. There are a couple of options available related to small differences. One you can set is Criterion. The default comparison for numeric values in the procedure is 0.00001 but you could make that smaller or larger though it would affect all the numeric comparisons and not variable by variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you use Proc Compare you want to make sure the data is sorted in the same order.&lt;/P&gt;
&lt;P&gt;The output can consist of data sets with a variety of different constructs so read the documentation. I would guess that you would likely not be interested in the normal printed output but only a data set with differences which could be accomplished with the NOPRINT out=&amp;lt;somedatasetname&amp;gt; and OUTNOEQUAL options. The last option would have the output data set empty if everything matches.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Oct 2020 16:50:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691902#M210665</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-10-15T16:50:52Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691907#M210667</link>
      <description>&lt;P&gt;Hi ballardw,&lt;BR /&gt;Thanks for the quick reply!&lt;BR /&gt;Will take a look to proc compare more attentively.&lt;BR /&gt;Some of the tables that need to be compared - are really huge, - hundreds of millions of rows, tens of Gigabytes etc. so I'm not sure if proc compare will handle those Monsters at all, but for smaller and average tables -it can be an option:)&lt;BR /&gt;I also remember that proc compare does a lot of additional work that is not needed for the case, that's why I initially decided not to use it, but as you mentioned - some of the features can be like "turned off", so maybe it will speed-up comparing.&lt;BR /&gt;&lt;BR /&gt;Thanks one more time!&lt;/P&gt;</description>
      <pubDate>Thu, 15 Oct 2020 16:59:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691907#M210667</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-15T16:59:53Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691937#M210676</link>
      <description>&lt;P&gt;I've never had any problems using PROC COMPARE on large tables. It will slow down the more differences it finds but in your case there shouldn't be many if you configure it to ignore tiny differences.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Oct 2020 19:01:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/691937#M210676</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-10-15T19:01:47Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692028#M210721</link>
      <description>&lt;P&gt;Use options &lt;FONT face="courier new,courier"&gt;method=absolute criterion=0.001&lt;/FONT&gt;&amp;nbsp; for example.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Oct 2020 01:29:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692028#M210721</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-10-16T01:29:22Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692063#M210739</link>
      <description>&lt;P&gt;Hi SASKiwi,&lt;/P&gt;
&lt;P&gt;It seems like those data sets that need to be compared are not sorted the same way and we don't even have ID for the majority of the tables, for part of the tables PK can be extracted from descriptive portion, but not for all.&lt;/P&gt;
&lt;P&gt;So proc compare with NOSORTED option and without ID, even don't sure it's possible compare that way.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Oct 2020 07:48:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692063#M210739</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-16T07:48:46Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692064#M210740</link>
      <description>&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;Will use the options&lt;/P&gt;</description>
      <pubDate>Fri, 16 Oct 2020 07:49:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692064#M210740</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-16T07:49:10Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692065#M210741</link>
      <description>&lt;P&gt;Thanks everyone for advices!&lt;/P&gt;
&lt;P&gt;Tried to use proc compare like:&lt;BR /&gt;proc compare base=&lt;STRONG&gt;saslib.tableX&lt;/STRONG&gt; compare=&lt;STRONG&gt;DBLib.tableX&lt;/STRONG&gt; NOPRINT method=absolute criterion=0.001 outnoequal out=noeqCols;&lt;BR /&gt;ID _ALL_ NOTSORTED;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;But seems like there is no way to use _ALL_ for the compare proc.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The problem is that for half of the tables there is no PK in the descriptive portion so not possible to use the ID statement, additionally seems like that tables sorted in a different way, so if sort both tables before comparing&amp;nbsp; - it will be too heavy I guess, some of the tables are tens of Gigabytes.&lt;/P&gt;
&lt;P&gt;I planned for such huge tables test only first/last million of rows, but the issues with decimals remain the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 16 Oct 2020 08:30:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692065#M210741</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-16T08:30:44Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692091#M210757</link>
      <description>&lt;P&gt;Hi again,&lt;/P&gt;
&lt;P&gt;I guess I'll go next way - if simple minus/except doesn't match - will do except for all character and numeric columns without decimal part, if that part of tables match - that means issue is in that decimal columns, at least something:)&lt;/P&gt;
&lt;P&gt;Set of columns will be possible extract using proc content/macro variables manipulations etc.:)&lt;/P&gt;
&lt;P&gt;Alternative way just instead of select * - do select &amp;amp;column_LIST&amp;nbsp; and form &amp;amp;column_LIST based on data type - if decimal - round, if no -leave as it is, but offcourse that will impact on perfopmance&lt;/P&gt;</description>
      <pubDate>Fri, 16 Oct 2020 10:22:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692091#M210757</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-16T10:22:45Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692325#M210893</link>
      <description>&lt;P&gt;&lt;EM&gt;&amp;gt; But seems like there is no way to use _ALL_ for the compare proc.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;The &lt;FONT face="courier new,courier"&gt;id&lt;/FONT&gt; statement defines they key. The keys are not compared, they are matched.&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;id _ALL_&lt;/FONT&gt; therefore makes no sense for proc compare.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&amp;gt; The problem is that for half of the tables there is no PK&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Well you need to solve that. You should only compare observations that have a reason to match, shouldn't you?&lt;/P&gt;</description>
      <pubDate>Sat, 17 Oct 2020 23:19:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692325#M210893</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-10-17T23:19:05Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692336#M210900</link>
      <description>&lt;P&gt;Hi again Chris,&lt;/P&gt;
&lt;P&gt;Thanks for the clarifications!&lt;/P&gt;
&lt;P&gt;Generally, you are right, as I can see now&amp;nbsp; - proc compare extremely efficient if use it in the right way - with ID columns and sorted data sets etc.&lt;/P&gt;
&lt;P&gt;But exactly in my case -&amp;nbsp; I know for sure that counts of tables match and they are sorted in diff. way and no defined Key etc.,&lt;/P&gt;
&lt;P&gt;Cases I would like to "catch" are like encoding issues during the migration, dates/decimals issue etc.&lt;/P&gt;
&lt;P&gt;What I already figured-out:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;For small and AVG tables&amp;nbsp; - Except/Minus way works ok, it doesn't require defining PK for the table and so on. Interesting point - according to my tests except/minus works almost the same time as proc compare with ID - NoSort option.&lt;/LI&gt;
&lt;LI&gt;For huge tables(from 5GB till 100GB in my case) - neither proc compare with ID nor Except doesn't perform.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Of course it depends on SAS configuration(CPU etc.), but in an actual situation using actual resources - the best way I see is split huge tables into smaller parts and compare part by part.&lt;/P&gt;
&lt;P&gt;A lot of tables doesn't have PK as I wrote before, but all have at least 1 index, so my approximate plan is:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Calculated on the fly for how many parts the table should be split so each part has around 1GB.&lt;/LI&gt;
&lt;LI&gt;Split table using most "distributed" column from the available index, usually it's some Surrogate Key, BIGInt.&lt;/LI&gt;
&lt;LI&gt;Compare each of that cut sub-tables in the loop(starting from the newest data), one by one.&lt;/LI&gt;
&lt;LI&gt;If there will be some difference - big probability it will be found on the first part, stop comparing after first founded diff.&lt;/LI&gt;
&lt;LI&gt;Logging on fly - so ech time it's visible that f.e. 40% of table X is processed already, no differences etc., so some trace file will be loaded before and after each step&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Point two is the most interesting from tech side. Looks like if the table is indexed and even if it's huge&amp;nbsp; - the SAS code like&lt;/P&gt;
&lt;P&gt;select distinct colA from tab1&lt;/P&gt;
&lt;P&gt;Executes very quick if there is an index on colA, so having such "keys table" and having table size&amp;nbsp; - very easy split table on N approximately same size parts, so we'll know that part 1 subtable starts from colA value 10 and ends with colA value 12345 , part 2 starts with cola value 1234&lt;STRONG&gt;6 &lt;/STRONG&gt;and end with value ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Generally, it seems like it was much easier move data from SAS to other DB then compare if data matched after the move:)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Oct 2020 09:23:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692336#M210900</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-18T09:23:02Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692399#M210933</link>
      <description>&lt;P&gt;Another solution: Compare just one compound field. Something like:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE1 HAVE2;
  array A[50] $8;
  do I= 1 to 1e4;
    A50=put(I,8.); 
    output HAVE1;
    if I=1e3 then A50=put(I+1,8.); 
    output HAVE2;
  end;
  drop I;
run;

data COMP1/view=COMP1;
  length COMP $16;
  set HAVE1;
  COMP=md5(catx('|',of _ALL_));
run;

data COMP2/view=COMP2;
  length COMP $16;
  set HAVE2;
  COMP=md5(catx('|',of _ALL_));
run;

proc sql;
  select COMP from COMP1
  except 
  select COMP from COMP2;
quit; 
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you find mismatches, you can re-run the views and filter using the offending values.&lt;/P&gt;
&lt;P&gt;You may have to tune how variable COMP is created to ensure that the wanted level of numeric precision is kept.&lt;/P&gt;
&lt;P&gt;Also, ensure the variables are concatenated in the same order for both tables.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Oct 2020 03:32:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692399#M210933</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-10-19T03:32:51Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692432#M210955</link>
      <description>&lt;P&gt;Hi again Chris,&lt;/P&gt;
&lt;P&gt;That's a really good idea, maybe for a few biggest tables more efficient will be split by columns and not by rows and compare like col by col(or subset of columns by subset of columns using md5 hash, as you proposed).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks so much for the advice!&lt;/P&gt;</description>
      <pubDate>Mon, 19 Oct 2020 06:35:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692432#M210955</guid>
      <dc:creator>Yura2301</dc:creator>
      <dc:date>2020-10-19T06:35:34Z</dc:date>
    </item>
    <item>
      <title>Re: Option for truncating "long" decimal numbers.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692471#M210977</link>
      <description>You can compare all columns and all rows in one go if you hash all the fields as shown. And if you can easily sort the tables if needed on that one field.  &lt;BR /&gt;</description>
      <pubDate>Mon, 19 Oct 2020 10:26:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Option-for-truncating-quot-long-quot-decimal-numbers/m-p/692471#M210977</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-10-19T10:26:05Z</dc:date>
    </item>
  </channel>
</rss>

