Hi,
I am little lost comparing two fields of numeric data type which has different format. As in I have two columns
Rate1 as numeric with Length 8, Format 6.3 and Informat as Best10.
Rate2 as numeric with Length 8, Format Best11, and Informat Best11.
For some compares, even if the two fields are similar in value it displaying as 'Yes' from my case statement.
I see for Rate2 format there is leading white spaces for all the values when I look at sas enterprise guide output window? Any suggestions, how to convert both fields in one similar format so it gives me proper results ? Thanks
Rate1 Rate2
11.174 11.174
proc sql;
select ID, rate1, rate2,
case when rate1=rate2 then 'No'
else 'Yes'
END as Status
From temp;
Quit;
Lesson: FORMAT has NO effect on value comparisons.
What it may do is make a person think two values are different, or the same, when looking at the value when the format is applied. A 6.3 format rounds the displayed value to 3 decimals. So if the actual value is 11.17399999 or 11.17400003 it will display as 11.174. A Best format will find the "best" way to fit the value into the specified number of character positions. Run this and look at the result in the log:
data _null_; x=12345678; put "Best 11 format" x=best11.; put "Best 6 format" x=best6.; put "Best 3 format" x=best3.; run;
All three values written are "equal" but displayed differently because of the number of characters allowed to display them.
"Leading white spaces" just means that the column sized picked by EG to display something is wider than the number of digits that being displayed.
Print the data set with a longer format assigned to both variables and see that the values are not equal perhaps:
proc print data=temp; where rate1 ne rate2; var rate1 rate2; format rate1 rate2 16.13; run;
Rule: If you need a specific level of precision of your variables it is up to you to round or truncate to that level of precision.
You could also have included the difference in your result:
proc sql; select ID, rate1, rate2, (rate1-rate2) as dif format=best12. , case when rate1=rate2 then 'No' else 'Yes' END as Status From temp; Quit;
Lesson: FORMAT has NO effect on value comparisons.
What it may do is make a person think two values are different, or the same, when looking at the value when the format is applied. A 6.3 format rounds the displayed value to 3 decimals. So if the actual value is 11.17399999 or 11.17400003 it will display as 11.174. A Best format will find the "best" way to fit the value into the specified number of character positions. Run this and look at the result in the log:
data _null_; x=12345678; put "Best 11 format" x=best11.; put "Best 6 format" x=best6.; put "Best 3 format" x=best3.; run;
All three values written are "equal" but displayed differently because of the number of characters allowed to display them.
"Leading white spaces" just means that the column sized picked by EG to display something is wider than the number of digits that being displayed.
Print the data set with a longer format assigned to both variables and see that the values are not equal perhaps:
proc print data=temp; where rate1 ne rate2; var rate1 rate2; format rate1 rate2 16.13; run;
Rule: If you need a specific level of precision of your variables it is up to you to round or truncate to that level of precision.
You could also have included the difference in your result:
proc sql; select ID, rate1, rate2, (rate1-rate2) as dif format=best12. , case when rate1=rate2 then 'No' else 'Yes' END as Status From temp; Quit;
Thank you so much for such clear explanation. As you suggested I ran the difference between rate 1 and rate 2 and it seems like for some the diff is in 0.0000959, 0.000829 etc. So, my case statement is displaying as 'Yes' as internally SAS compiler thinks they are different. Even after I truncate rate1 and rate2 to 3 decimal places and I still seeing the difference even if it is 12.189 values for both rate1 and rate2. How can I resolve this ?
I used below for truncation, I have two user defined functions
Libname FD Base "/path"
proc fcmp outlib = FD.functions.truncations;
function truncn(x,d);
p = 10**d;
y = int(x*p)/p;
return(y);
endsub;
function truncc(x,d);
y = put(x,best.);
y= substr(y,1, find(y, '.')+d);
return(y);
endsub;
run;
options cmplib=FD.functions;
data A
length x n 8 c $9;
input x;
n =truncn(x,3);
datalines;
3.14159
557.89763
;
Thanks for that tip, I can get the results I desired. What I did was calculating the diff value if it is greater than 0.0005 then the case statement result would be 'yes' if it is less than 0.0005 then 'No'.
Man, I am so new to Sas, so much to learn.
Note that dealing with floating point numbers is not unique to SAS.
data want ;
set have;
similar = abs(rate1-rate2)<0.0005;
run;
@sasnew_484 wrote:
Thanks for that tip, I can get the results I desired. What I did was calculating the diff value if it is greater than 0.0005 then the case statement result would be 'yes' if it is less than 0.0005 then 'No'.
Man, I am so new to Sas, so much to learn.
You might want to look in the documentation for the COMPFUZZ function as well. It allows you to set a fuzz value of how close you want things to be for treating as equal, i.e one of the parameters could be 0.00005 and values within that range of each other are considered "equal".
I didn't work through your trunc functions but suspect you have a similar numeric precision issue as decimal values and internal binary representations often run into some problem with the more decimal positions the more likely for a precision issue.
thank you for this information. Will keep this in my mind.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.