Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Hi All,

I am trying to split a set of observations into two groups, depending on whether each value is LE or GT the median value.
I rounded my incoming numbers to 2 decimal places using round(var,0.01), and the median value is calculated from that dataset. But when I take this median from proc univariate, pass it to the dataset as a macro var, and try to flag the records, the record with exactly the same value as the median goes into the GTMedian bucket.

As an alternative I tried rounding the median to 1 decimal place before feeding it to the macro var, but that did not work either.

So I took the difference between each observation and the median, and the diff for the record in question was -2.22045E-16.
I have tried many methods but the issue still exists.
Is this some format/informat issue?

Thank you,

raisins25


Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to raisins25

How are you creating the macro variable? Since macro variables are basically text, there are some other issues to consider.

Also, how are you flagging them? Some code may provide hints as to what correction you need.

Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to raisins25

Yes, a small amount of sample data would help people help you.

In addition to macro vars being text, these sorts of precision issues can occur even in data set variables. Computers can't represent all non-integers exactly, so you can end up with something that looks odd, like below:

data _null_;
  if .1+.1+.1=.3 then put "decimal math works.";
  else put "uh-oh, numeric precision problem!!";
run;

uh-oh, numeric precision problem!!

For more background, see e.g.:

http://blogs.sas.com/content/iml/2012/06/25/programming-tip-avoid-testing-floating-point-values-for-...
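
A quick way to see what is going on is to print the stored bits with the HEX16. format. The sketch below is only an illustration: it assumes SAS running on a platform with IEEE 754 doubles (Windows, Linux, Unix), and the variable names are made up. It also prints constant('MACEPS'), which is 2**-52, roughly 2.22045E-16. That is the same magnitude as the difference reported in the question, which is consistent with a discrepancy in the last bit of a value between 1 and 2.

```sas
data _null_;
  a = .1 + .1 + .1;               /* computed value                     */
  b = .3;                         /* literal that "should" be the same  */
  dif = a - b;                    /* tiny but nonzero when bits differ  */
  maceps = constant('MACEPS');    /* machine epsilon for doubles        */

  /* HEX16. shows the raw floating-point representation of each value */
  put a= hex16.;
  put b= hex16.;

  /* scientific notation makes the size of the difference visible */
  put dif= e12. maceps= e12.;
run;
```

If the two hex strings differ, even only in the last digit, the values are not equal as far as =, LE and GT are concerned, no matter how they look when printed with a default format.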

Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to raisins25

Thanks for your input...
Some additional info:


/* Keep visits 2 and 5 and total the two volume variables, rounded to 2 decimals */
data vol;
  set crf.stxvol;
  where visit in (2,5);
  format _all_;
  informat _all_;
  tot = round(sum(STIJVL3,STIJVL4),0.01);
  if tot gt 0;
  keep pt cpevent tot;
run;

proc sort data=vol; by pt cpevent; run;

/* One record per patient: fin is the single tot, or the first tot plus the last tot */
data vol;
  set vol;
  by pt cpevent;
  retain l;
  if first.pt then l = tot;
  if first.pt eq last.pt then fin = tot;
  else if last.pt then fin = tot + l;
  if last.pt;
  keep pt fin;
run;

proc sort data=vol; by pt fin; run;

proc means data=vol n mean median noprint;
  var fin;
  output out=y n=n mean=mean median=median1;
run;

/* Round the median to 1 decimal place before passing it on */
data y;
  set y;
  median = round(median1,0.1);
  keep median;
run;

proc sql noprint;
  select median into :median
  from y;
quit;

%let median = %left(%trim(&median));
%put &median.;

/* Create a flag var using the median as the cutoff */
data vol;
  set vol;
  length flag $80;
  if fin le &median then flag = "LEmedianML";
  else if fin gt &median then flag = "GTmedianML";
  if flag ne '';
  keep pt flag;
run;

The one pt whose fin value is the same as the median gets the GTmedianML flag.

That is when I created a diff variable to see the difference between the fin var and the median, and for this patient I am getting the diff as 2.22045E-16.

Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to raisins25

You probably have variables with length < 8 in your permanent dataset.

Try this little program.

%let diff=2.22045E-16;

data test;
  length x y 8 z 4;
  do i=1 to 2 by .1;
    x=i;
    y=x+&diff;
    z=x+&diff;
    output;
  end;
run;

data test2;
  set test;
  if y ne z;
run;

Then change the length of Z to 8 and run it again.

Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Thanks Tom.

I tried assigning a length of 8 to the tot variable, which pulls data from the perm dataset. It did not work.

R

Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to raisins25

raisins25,

Most likely, it is SQL that is rounding your median value.  It uses an 8-character format for translating from numeric to a character string.

Forget the SQL, forget the rounding, and forget using a macro variable.  Just bring the median into your final data step:

data vol;
  set vol;
  if _n_=1 then set y (keep=median1);
  length ...

Good luck.
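
To illustrate why a trip through macro-variable text can change a number, here is a minimal sketch. It assumes the numeric-to-text conversion behaves like the BEST8. format, in line with the 8-character format mentioned above, and the value 1/3 and the names m, m_text and m_hex are hypothetical. The second macro variable stores the value with HEX16., which is one possible way to carry a double through text without losing bits.

```sas
data _null_;
  m = 1/3;                                 /* hypothetical statistic with many digits   */
  call symputx('m_text', put(m, best8.));  /* 8-character text, digits are dropped      */
  call symputx('m_hex',  put(m, hex16.));  /* full bit pattern written as hex text      */
run;

%put NOTE: m_text=&m_text m_hex=&m_hex;

data _null_;
  m = 1/3;

  /* the 8-character text is re-read as a different number */
  if m = &m_text then put 'BEST8. text: same value';
  else put 'BEST8. text: value changed';

  /* the HEX16. informat rebuilds the double from its hex representation */
  back = input("&m_hex", hex16.);
  if m = back then put 'HEX16. text: same value';
  else put 'HEX16. text: value changed';
run;
```

Reading the statistic straight from the data set, as in the step above, avoids the conversion altogether. It does not remove differences that were already present in the numbers before they were compared.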


Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to Astounding

Thanks Astounding.

I have already tried this method, bypassing the macro var in case that was the problem. It didn't work either.

R

Solution
06-27-2012 08:48 PM

Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Posted in reply to raisins25

Still feels like a numeric precision problem to me. While numeric vars with length <8 can make the problem more likely, it can still happen when length=8. Below is an example showing that the mean of (.1,.1,.1) is slightly greater than .1. This feels like the same problem you are running into. The link to Rick Wicklin's blog has some suggestions for introducing a fuzz factor if you need to do these sorts of comparisons.

data a;
  x=.1;
  output;
  output;
  output;
run;

NOTE: The data set WORK.A has 3 observations and 1 variables.

proc means data=a mean noprint;
  var x;
  output out=b mean=mean;
run;

NOTE: There were 3 observations read from the data set WORK.A.
NOTE: The data set WORK.B has 1 observations and 3 variables.

data c;
  if _n_=1 then set b (keep=mean);
  set a (keep=x);
  if x<mean then type='low ';
  else if x>mean then type='high';
  else if x=mean then type='mean';
  dif=x-mean;
  put (x mean type dif) (=);
run;

x=0.1 mean=0.1 type=low dif=-1.38778E-17
x=0.1 mean=0.1 type=low dif=-1.38778E-17
x=0.1 mean=0.1 type=low dif=-1.38778E-17
NOTE: There were 1 observations read from the data set WORK.B.
NOTE: There were 3 observations read from the data set WORK.A.
NOTE: The data set WORK.C has 3 observations and 4 variables.
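
For comparison, here is a sketch of the same three-way classification with a fuzz factor. This is one possible way to do it, not necessarily what the blog post recommends: it treats x and the mean as equal when they differ by no more than constant('SQRTMACEPS'), an absolute tolerance chosen here as an assumption. For values much larger than 1, a tolerance scaled by the magnitude of the numbers may be more appropriate.

```sas
data a;
  x = .1;
  output; output; output;
run;

proc means data=a mean noprint;
  var x;
  output out=b mean=mean;
run;

data c;
  if _n_ = 1 then set b (keep=mean);
  set a (keep=x);
  length type $4;
  eps = constant('SQRTMACEPS');            /* tolerance, an assumed choice         */

  /* test for "equal within tolerance" first, then decide low or high */
  if abs(x - mean) le eps then type = 'mean';
  else if x < mean then type = 'low';
  else type = 'high';

  put (x mean type) (=);
run;
```

Testing the tolerance first matters: once two values are within eps of each other, the < and > branches are never reached, so a last-bit difference can no longer push a record into the wrong bucket.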


Re: Format issue, SAS giving difference of 2 identical numbers as -2.22045E-16

Quentin,

You are right! I introduced the fuzz factor method and it worked!

This is what I did.

I also added the median as a variable rather than as a macro var.

data vol_;
  set vol_;
  length flag $80;
  eps = constant("SQRTMACEPS");
  value = median + eps;
  if fin lt value then flag = "LEmedianML";
  else flag = "GTmedianML";
  keep pt flag;
run;

Thank you all for your inputs!!
