
Why the values of numeric variable in two data sets containing equal values are not matching

Super Contributor
Posts: 323


Dear,

 

I ran PROC COMPARE on two datasets. The values of the numeric variable look exactly the same in both datasets, but PROC COMPARE reports them as not matching. Any suggestions? Please help. Thanks.

The format and informat are the same in both datasets.

I attached the PROC COMPARE report.

 

Dataset base: AVAL = 102.64285714

Dataset compare: AVAL = 102.64285714

 

 

Super User
Posts: 23,776

Re: Why the values of numeric variable in two data sets containing equal values are not matching

Posted in reply to knveraraju91

Look at the FUZZ= option in PROC COMPARE.

Super User
Posts: 10,280

Re: Why the values of numeric variable in two data sets containing equal values are not matching

Posted in reply to knveraraju91

Visibly equal data may not be technically equal. SAS stores numbers in binary floating point, which can introduce tiny imprecisions many places after the decimal point. Typical number formats do not display them, but every comparison detects them.

You can see this when PROC COMPARE displays the difference as 1E-14 or something similar.

Either use the fuzz factor in compare (as @Reeza already suggested), or make use of the round() function when doing calculations that involve fractions.
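To make the two suggestions above concrete, here is a minimal sketch. It assumes the datasets are called work.base and work.comp (hypothetical names; the thread only shows them as "base" and "compare") and that a tolerance of 1E-9 is acceptable for AVAL. As far as I understand the documentation, FUZZ= mostly changes how tiny values and differences are printed in the report, while METHOD= and CRITERION= decide whether two values are judged equal, so the sketch uses those. Please verify that against the documentation for your SAS release.

```sas
/* Assumption: work.base and work.comp are placeholder dataset names, */
/* and 1E-9 is an arbitrary tolerance chosen for illustration.        */

/* Option 1: judge values equal when they differ by less than 1E-9 */
proc compare base=work.base compare=work.comp
             method=absolute criterion=1e-9;
  var AVAL;
run;

/* Option 2: round the calculated value when it is created, so both */
/* datasets store the same number. Apply the same step to work.comp. */
data work.base_rounded;
  set work.base;
  AVAL = round(AVAL, 1e-8);  /* 8 decimals, matching the displayed values */
run;
```

Which option fits better depends on whether you control the code that creates AVAL. Rounding at the source fixes the data itself, while the comparison tolerance only changes what the report counts as a difference.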

Super User
Posts: 13,583

Re: Why the values of numeric variable in two data sets containing equal values are not matching

Posted in reply to knveraraju91

And what is the practical effect of a difference of 0.000000000000013840 going to be on your process?

 

 

Super User
Posts: 8,127

Re: Why the values of numeric variable in two data sets containing equal values are not matching

Posted in reply to knveraraju91

That is just one of the side effects of using binary computers to perform mathematical operations.

Many numbers cannot be stored exactly in the IEEE floating point format, so you need to take that into consideration.

Here is a simple example that uses the fact that one tenth cannot be stored exactly.

 

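data _null_;
  one=1 ;
  /* Add one tenth ten times; the sum statement starts another_one at 0 */
  do _n_=1 to 10 ;
     another_one + (1/10) ;
  end;
  /* Mathematically zero, but not in binary floating point */
  diff = one - another_one ;
  put (_all_) (= best32. /  );
run;

Log output:

one=1
another_one=1
diff=1.1102230246251E-16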
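
Building on the example above, here is a small sketch of how rounding can remove that kind of noise before an equality test. The rounding unit 1E-9 and the names raw_equal and rounded_equal are assumptions made for illustration; pick a unit that suits the precision your data really has.

```sas
data _null_;
  one = 1;
  another_one = 0;
  /* Accumulate one tenth ten times, as in the example above */
  do i = 1 to 10;
    another_one + (1/10);
  end;
  /* Direct comparison of the stored binary values */
  raw_equal = (one = another_one);
  /* Compare after rounding both sides to the same unit (1E-9 is an assumption) */
  rounded_equal = (round(one, 1e-9) = round(another_one, 1e-9));
  put raw_equal= rounded_equal=;
run;
```

The idea is to compare values only at the precision that matters to the process, rather than at the last bit of the floating point representation.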