Proc Compare- Why two same integer values are different?

New Contributor
Posts: 2

 

Hi everyone, I have a question about PROC COMPARE. Why are two seemingly identical integer values reported as different in my compared datasets?

 

                ||   Base         Compare
  Obs           ||   Tab1         Tab2         Diff.        % Diff
  ____________  ||   _________    _________    _________    _________
  710779        ||   4761         4761         -9.09E-13    -1.91E-14

 

 

Thanks in advance. 

Super User
Posts: 9,842

Re: Proc Compare- Why two same integer values are different?

This has been covered many times. SAS stores numbers in 8 bytes, as floating point. I can't remember the exact maths, but sometimes a calculation leaves a number stored with a tiny fractional part, so your raw numbers may be:

4761

vs

4760.999999999999
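For what it's worth, the reported Diff. of -9.09E-13 is the same size as the gap between 4761 and its nearest neighbouring 8-byte floating-point value. Here is a minimal sketch in Python rather than SAS, assuming Python 3.9+ and that both use the same IEEE 754 double format (true for SAS on Windows and Unix). The names `base` and `compare` are only illustrative, and the percent formula is an assumption about how the % Diff column is derived:

```python
import math

base = 4761.0

# Gap between 4761 and the next representable double above it
# (one "unit in the last place").
ulp = math.ulp(base)

# Nearest representable double just below 4761.
compare = math.nextafter(base, 0.0)

diff = compare - base
pct_diff = diff / base * 100  # assumed formula: difference relative to base

print(ulp)                # 9.094947017729282e-13
print(diff)               # -9.094947017729282e-13
print(pct_diff)           # about -1.91e-14
print(f"{compare:.0f}")   # 4761 when shown without decimals
```

So a value that prints as 4761 can still differ from the true integer 4761 by one step of the floating-point grid.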

 

Use the ROUND, INT, FLOOR etc. functions on it. In general, in any calculation I round or otherwise fix the result to make sure. You can also use the fuzz option:

https://communities.sas.com/t5/SAS-Procedures/Proc-compare-showing-matching-values-even-after-using-...

 

But in general I would recommend fixing the number.
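A rough illustration of why rounding after a calculation helps, written in Python rather than SAS (the same 8-byte doubles are assumed). The variable names are made up for the example; in SAS the analogous fix would be something like ROUND(total, 1):

```python
# Adding 0.1 ten times should give exactly 1, but 0.1 has no exact
# binary representation, so each addition carries a tiny error.
total = 0.0
for _ in range(10):
    total += 0.1

print(total == 1.0)    # False
print(repr(total))     # 0.9999999999999999

# Snap the result to the nearest integer before storing or comparing it.
fixed = round(total)
print(fixed == 1)      # True
```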

New Contributor
Posts: 2

Re: Proc Compare- Why two same integer values are different?

Thank you @RW9 @ballardw @SASKiwi ! It really helped me. 

Super User
Posts: 4,026

Re: Proc Compare- Why two same integer values are different?

As @RW9 says, this comes up all the time, and there are many references to the issue of numeric precision in the documentation, including this:

http://documentation.sas.com/?docsetId=hostwin&docsetTarget=n04ccixfia6l2pn1f8szvttqg3hm.htm&docsetV...

 

Because SAS stores numerics in 8 bytes of memory / disk, it can accurately represent only about the first 15 significant digits of any number (reading left to right). PROC COMPARE is quite accurately reporting differences beyond the first 15 digits. You can control what PROC COMPARE reports as "numerically equal" by tweaking the CRITERION= and METHOD= options:

http://documentation.sas.com/?docsetId=proc&docsetTarget=n0c1y14wyd3u7yn1dmfcpaejllsn.htm&docsetVers...
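To show the general idea of comparing with a tolerance instead of exactly, here is a small Python sketch. It is not PROC COMPARE's own algorithm (the METHOD= formulas are defined in the SAS documentation linked above); the tolerance values and variable names are assumptions chosen for illustration:

```python
import math

base = 4761.0
compare = base - 2.0 ** -40   # differs from base by about 9.09e-13

# Exact comparison: any difference at all counts as unequal.
exact_equal = (base == compare)

# Absolute tolerance: equal if the difference is no larger than 1e-8.
abs_equal = abs(base - compare) <= 1e-8

# Relative tolerance: equal if the difference is tiny compared with
# the size of the values themselves.
rel_equal = math.isclose(base, compare, rel_tol=1e-12, abs_tol=0.0)

print(exact_equal)  # False
print(abs_equal)    # True
print(rel_equal)    # True
```

An absolute tolerance is easy to reason about when you know the scale of the data; a relative one adapts when values range over many orders of magnitude.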

Super User
Posts: 13,941

Re: Proc Compare- Why two same integer values are different?

One or both of those values is not an integer. The PROC COMPARE output shows the values using a default format that apparently does not display the 12 or so decimal places needed to see the difference.

 

If you think that variable should be an integer, round or truncate it before you use or compare it.

 

If you think that a difference like 0.000000000000909 will not cause problems for further use, you could also run your PROC COMPARE code with the FUZZ= numericvalue option to ignore very small results.

For example, FUZZ=1E-8 would treat any absolute difference less than 0.00000001 as "good enough" and not report it as a difference in the comparison.
