BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasnew_484
Fluorite | Level 6

Hi, 

I am little lost comparing two fields of numeric data type  which has different format. As in I have two columns

Rate1 as numeric with Length 8, Format 6.3 and Informat as Best10.

Rate2 as numeric with Length 8, Format Best11, and Informat Best11. 

For some compares, even if the two fields are similar in value it displaying as 'Yes' from my case statement. 

I see for Rate2 format there is leading white spaces for all the values when I look at sas enterprise guide output window? Any suggestions, how to convert both fields in one similar format so it gives me proper results ? Thanks 

Rate1     Rate2

11.174   11.174

 

proc sql;
select ID, rate1, rate2, 
case when rate1=rate2 then 'No' 
else 'Yes' 
END as Status 
From temp; 
Quit;

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Lesson: FORMAT has NO effect on value comparisons.

What it may do is make a person think two values are different, or the same, when looking at the value when the format is applied. A 6.3 format rounds the displayed value to 3 decimals. So if the actual value is 11.17399999 or 11.17400003 it will display as 11.174. A Best format will find the "best" way to fit the value into the specified number of character positions. Run this and look at the result in the log:

data _null_;
   x=12345678;
   put "Best 11 format" x=best11.;
   put "Best  6 format" x=best6.;
   put "Best  3 format" x=best3.;
run;

All three values written are "equal" but displayed differently because of the number of characters allowed to display them.

"Leading white spaces" just means that the column sized picked by EG to display something is wider than the number of digits that being displayed.

 

Print the data set with a longer format assigned to both variables and see that the values are not equal perhaps:

proc print data=temp;
   where rate1 ne rate2;
   var rate1 rate2;
   format rate1 rate2 16.13;
run;

Rule: If you need a specific level of precision of your variables it is up to you to round or truncate to that level of precision.

 

You could also have included the difference in your result:

proc sql;
   select ID, rate1, rate2, (rate1-rate2) as dif format=best12. , 
   case when rate1=rate2 then 'No' 
   else 'Yes' 
   END as Status 
From temp; 
Quit; 

View solution in original post

6 REPLIES 6
ballardw
Super User

Lesson: FORMAT has NO effect on value comparisons.

What it may do is make a person think two values are different, or the same, when looking at the value when the format is applied. A 6.3 format rounds the displayed value to 3 decimals. So if the actual value is 11.17399999 or 11.17400003 it will display as 11.174. A Best format will find the "best" way to fit the value into the specified number of character positions. Run this and look at the result in the log:

data _null_;
   x=12345678;
   put "Best 11 format" x=best11.;
   put "Best  6 format" x=best6.;
   put "Best  3 format" x=best3.;
run;

All three values written are "equal" but displayed differently because of the number of characters allowed to display them.

"Leading white spaces" just means that the column sized picked by EG to display something is wider than the number of digits that being displayed.

 

Print the data set with a longer format assigned to both variables and see that the values are not equal perhaps:

proc print data=temp;
   where rate1 ne rate2;
   var rate1 rate2;
   format rate1 rate2 16.13;
run;

Rule: If you need a specific level of precision of your variables it is up to you to round or truncate to that level of precision.

 

You could also have included the difference in your result:

proc sql;
   select ID, rate1, rate2, (rate1-rate2) as dif format=best12. , 
   case when rate1=rate2 then 'No' 
   else 'Yes' 
   END as Status 
From temp; 
Quit; 
sasnew_484
Fluorite | Level 6

Thank you so much for such clear explanation. As you suggested I ran the difference between rate 1 and rate 2 and it seems like for some the diff is in 0.0000959, 0.000829 etc. So, my case statement is displaying as 'Yes' as internally SAS compiler thinks they are different. Even after I truncate rate1 and rate2 to 3 decimal places and I still seeing the difference even if it is 12.189 values for both rate1 and rate2. How can I resolve this ?

 

I used below for truncation, I have two user defined functions 

Libname FD Base "/path"

proc fcmp outlib = FD.functions.truncations;

function truncn(x,d);

p = 10**d;

y = int(x*p)/p;

return(y);

endsub;

function truncc(x,d);

y = put(x,best.);

y= substr(y,1, find(y, '.')+d);

return(y);

endsub;

run;

 

options cmplib=FD.functions;

data A

length x n 8 c $9;

input x;

n =truncn(x,3);

datalines;

3.14159

557.89763

;

sasnew_484
Fluorite | Level 6

Thanks for that tip, I can get the results I desired. What I did was calculating the diff value if it is greater than 0.0005 then the case statement result would be 'yes' if it is less than 0.0005 then 'No'. 

 

Man, I am so new to Sas, so much to learn. 

Tom
Super User Tom
Super User

Note that dealing with floating point numbers is not unique to SAS.

data want ;
  set have;
  similar = abs(rate1-rate2)<0.0005;
run;
ballardw
Super User

@sasnew_484 wrote:

Thanks for that tip, I can get the results I desired. What I did was calculating the diff value if it is greater than 0.0005 then the case statement result would be 'yes' if it is less than 0.0005 then 'No'. 

 

Man, I am so new to Sas, so much to learn. 


You might want to look in the documentation for the COMPFUZZ function as well. It allows you to set a fuzz value of how close you want things to be for treating as equal, i.e one of the parameters could be 0.00005 and values within that range of each other are considered "equal".

 

I didn't work through your trunc functions but suspect you have a similar numeric precision issue as decimal values and internal binary representations often run into some problem with the more decimal positions the more likely for a precision issue.

 

sasnew_484
Fluorite | Level 6

thank you for this information. Will keep this in my mind. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2246 views
  • 3 likes
  • 3 in conversation