turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- How to check variables for equality, within tolera...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-13-2015 01:38 PM

I want to compare two variables and return 'true' if they are equal within a specified tolerance, otherwsie 'false', and return 'false' if one of them is missing but 'true' if they are both missing. Is there an easy way to do this in a data step?

x | y | f(x,y) |

0.5000000 | 0.5000001 | TRUE |

0.5000000 | . | FALSE |

. | . | TRUE |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-13-2015 01:51 PM - edited 11-13-2015 01:54 PM

Use Round() :

```
data test;
input x y;
format x y best12.;
datalines;
0.5000000 0.5000001
0.5000000 .
. .
;
%let tolerance=1e-6;
data want;
set test;
f = round(x,&tolerance.) = round(y,&tolerance.);
run;
proc print data=want noobs; run;
```

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-13-2015 01:54 PM

Try this:

```
data have;
infile cards missover;
input x y;
format x 10.7 y 10.7;
comp_flag=ifc(round(x,0.000001)=round(y,0.000001),'Ture','Fail');
cards;
0.5000000 0.5000001
0.5000000 .
. .
;
proc print;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-13-2015 02:53 PM

I would advise you to stay away from the ROUND function on each of the values. It will not always give you the result you want. For example:

tolerance = 0.01;

x = 2.004;

y = 1.993;

If you round each variable separately, you'll get 2 - 1.99 which is TRUE. But the difference between them is 0.011 which is FALSE. Instead, take the absolute value of the difference and compare that to your tolerance level. It's a little clumsy, but you could try:

length result $ 5;

if x = y = . then result = 'TRUE';

else if x=. then result = 'FALSE';

else if y=. then result = 'FALSE';

else do;

if abs(x - y) <= tolerance then result='TRUE';

else result='FALSE';

end;

There may be slicker ways to code this, but stay away from ROUND.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-13-2015 04:05 PM

Astounding wrote:

I would advise you to stay away from the ROUND function on each of the values. It will not always give you the result you want. For example:

tolerance = 0.01;

x = 2.004;

y = 1.993;

If you round each variable separately, you'll get 2 - 1.99 which is TRUE. But the difference between them is 0.011 which is FALSE. Instead, take the absolute value of the difference and compare that to your tolerance level. It's a little clumsy, but you could try:

length result $ 5;

if x = y = . then result = 'TRUE';

else if x=. then result = 'FALSE';

else if y=. then result = 'FALSE';

else do;

if abs(x - y) <= tolerance then result='TRUE';

else result='FALSE';

end;

There may be slicker ways to code this, but stay away from ROUND.

Nice thing abou this is if you have a changing tolerance it is easier to use by supplying the tolerance value as part of the incoming data.

I might suggest a 1/0 coding instead of "True"/"False" as if you take the mean of the 1/0 then you get a percentage of in tolerance or sum a count of within tolerance not to mention use in some modeling approaches.

With either approach I might suggest that x=. or y=. is missing instead of coded as "False" as someone could mistake that with the result of an actual comparison between two values.