Hi, I had two data sets and encountered an issue that the value in a data set extracted from Snowflake are slightly different to the original value stored in SAS data set. See the example:
DATA_SF DATA_Ori
COST_AMT COST_AMT Diff. %Diff
44.48 44.48 7.11E-15 1.60E-14
5.1 5.1 8.88E-16 1.74E-14
My questions: What could be the reason for this different? Suggestion?
Thank you very much!
Hi @leehsin,
I would make sure that the values are correct if they are processed later on. Otherwise these small rounding errors (7.11E-15 etc.) can cause unwanted issues in all sorts of comparisons, including IF or WHERE conditions, assignment of user-defined format categories, merging and sorting. Similar errors can arise from calculations, even within SAS.
Example:
data data_ori;
input cost_amt;
cards;
44.48
5.1
;
data data_sf;
set data_ori;
cost_amt=cost_amt*449/449;
run;
proc compare data=data_sf c=data_ori;
run;
Result (using Windows SAS 9.4M5):
|| Base Compare Obs || cost_amt cost_amt Diff. % Diff ________ || _________ _________ _________ _________ || 1 || 44.4800 44.4800 7.105E-15 1.597E-14 2 || 5.1000 5.1000 8.882E-16 1.742E-14
In both cases the calculation (seemingly a multiplication by 1) has reduced the result by one unit of the least significant bit in the internal binary floating-point representation, namely 2**-47=7.105...E-15 in the case of 44.48 and 2**-50=8.8817...E-16 for 5.1. You can notice the errors in the HEX16. format:
proc print data=data_sf;
format cost_amt hex16.;
run;
Result:
Obs cost_amt 1 40463D70A3D70A3C 2 4014666666666665
Even without seeing the corresponding representations of the original values (in data_ori) it's fairly obvious that the correct last hex digits should be D and 6, respectively (in view of the repeating digit patterns). So, look at a few of your values in HEX16. format to find out where the errors have occurred.
Besides calculations, "moving [data] across application and/or architecture borders" is a common way of introducing this type of rounding (or numeric representation) errors, as mentioned by Kurt_Bremser. I would follow ballardw's advice and correct the errors (in SAS) by applying the ROUND function with an appropriate rounding unit (which depends a bit on your data).
Example:
data want;
set data_sf;
cost_amt=round(cost_amt,1e-7);
run;
Numeric precision - basically computers aren't perfect and can't track decimals exactly after a certain point.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lepg/p0dv87zb3bnse6n1mqo360be70qr.htm
Is a difference of 0.0000000000000711 of significance to you?
@leehsin wrote:
Hi, I had two data sets and encountered an issue that the value in a data set extracted from Snowflake are slightly different to the original value stored in SAS data set. See the example:
DATA_SF DATA_Ori
COST_AMT COST_AMT Diff. %Diff
44.48 44.48 7.11E-15 1.60E-14
5.1 5.1 8.88E-16 1.74E-14
My questions: What could be the reason for this different? Suggestion?
Thank you very much!
You will often find such differences when moving across application and/or architecture borders. We found differences between SAS on the mainframe and SAS on UNIX/Windows simply because the MF uses more bits in the mantissa.
So, once you verified how the differences originate and that they are statistically insignificant (which you have done), live with them.
SAS stores all numbers in 64 bits (8 bytes) floating point. This leads to precision limitations (see the linked article), and if your other data source uses another storage format, there will be differences.
E-15 range of differences likely points to precision of different machines as very likely issue.
If you are actually only using two decimal places then round or truncate the data to the required precision you use.
Hi @leehsin,
I would make sure that the values are correct if they are processed later on. Otherwise these small rounding errors (7.11E-15 etc.) can cause unwanted issues in all sorts of comparisons, including IF or WHERE conditions, assignment of user-defined format categories, merging and sorting. Similar errors can arise from calculations, even within SAS.
Example:
data data_ori;
input cost_amt;
cards;
44.48
5.1
;
data data_sf;
set data_ori;
cost_amt=cost_amt*449/449;
run;
proc compare data=data_sf c=data_ori;
run;
Result (using Windows SAS 9.4M5):
|| Base Compare Obs || cost_amt cost_amt Diff. % Diff ________ || _________ _________ _________ _________ || 1 || 44.4800 44.4800 7.105E-15 1.597E-14 2 || 5.1000 5.1000 8.882E-16 1.742E-14
In both cases the calculation (seemingly a multiplication by 1) has reduced the result by one unit of the least significant bit in the internal binary floating-point representation, namely 2**-47=7.105...E-15 in the case of 44.48 and 2**-50=8.8817...E-16 for 5.1. You can notice the errors in the HEX16. format:
proc print data=data_sf;
format cost_amt hex16.;
run;
Result:
Obs cost_amt 1 40463D70A3D70A3C 2 4014666666666665
Even without seeing the corresponding representations of the original values (in data_ori) it's fairly obvious that the correct last hex digits should be D and 6, respectively (in view of the repeating digit patterns). So, look at a few of your values in HEX16. format to find out where the errors have occurred.
Besides calculations, "moving [data] across application and/or architecture borders" is a common way of introducing this type of rounding (or numeric representation) errors, as mentioned by Kurt_Bremser. I would follow ballardw's advice and correct the errors (in SAS) by applying the ROUND function with an appropriate rounding unit (which depends a bit on your data).
Example:
data want;
set data_sf;
cost_amt=round(cost_amt,1e-7);
run;
Try options METHOD= and FUZZ= :
data data_ori;
input cost_amt;
cards;
44.48
5.1
;
data data_sf;
set data_ori;
cost_amt=cost_amt*449/449;
run;
proc compare data=data_sf c=data_ori method=relative fuzz=1E-10;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.