Help using Base SAS procedures

proc sql sum is inaccurate

Reply
Contributor
Posts: 28

proc sql sum is inaccurate

I have data like this:

Pat_ID     Discharge     Admit          Paid_Amount

A                                   2/1/2012     -20.48

A                                   2/1/2012     -10.18

A                                   2/1/2012     -1300.44

A                                   2/1/2012      1300.44

A                                   2/1/2012     10.18

A                                   2/1/2012     20.48

A               1/1/2012      2/1/2012     50.50

A               1/1/2012      2/1/2012     60.48

I need to sum up the paid_amounts by Pat_ID, discharge and admit so I used this:

proc sql;

create table sumPaid as

select pat_id, admit, discharge, sum(paid_amount) as paid_amount

from pat_det

by pat_id, admit, discharge;

quit;

Which gives me  something like this:

pat_ID     Admit     Discharge     paid_amount

A          2/1/2012                4.54747474E-13

A          2/1/2012  1/1/2012         110.98

You can tell that the observations with blank discharge but admit 2/1/2012 have equal paid_amounts that should add up to 0 yet for similar situations, I get extremely small E-13 sums and I have no idea why that could be.

Respected Advisor
Posts: 3,124

Re: proc sql sum is inaccurate

One possibility is that you are tricked by the format.  See below:

data pat_det;

     input Pat_ID :$10.    (Discharge     Admit) (:mmddyy10.)          Paid_Amount /*default format is best12.*/;

     cards;

A                      .             2/1/2012     -20.48

A .                                  2/1/2012     -10.18

A .                                  2/1/2012     -1300.44

A . 2/1/2012      1300.4400000001

A   .                                2/1/2012     10.18

A    .                               2/1/2012     20.48

A              1/1/2012      2/1/2012     50.50

A              1/1/2012      2/1/2012     60.48

;

Capture02051.PNG

proc sql;

     create table sumPaid as

           select pat_id, admit, discharge, sum(paid_amount) as paid_amount

                from pat_det

                     group by pat_id, admit, discharge;

quit;

Capture0205.PNG

Respected Advisor
Posts: 3,124

Re: proc sql sum is inaccurate

One way to reconcile is to use ROUND(), before SUM.

Super User
Posts: 10,538

Re: proc sql sum is inaccurate

Actually I think this might be a case of round After sum as  the first example would like to be 0 but I think the binary representation of negative and positive is part of the issue.

Respected Advisor
Posts: 3,124

Re: proc sql sum is inaccurate

Good point. However, I was worry about the small discrepancies may add up big if not using ROUND() at early stage.

Super User
Posts: 10,538

Re: proc sql sum is inaccurate

I don't have to deal with negative numbers in my work often. This example tells me that if you're dealing with such then you may need to consider rounding at both ends. I did a quick trial with the data as shown by the OP and created a paid amount rounded to 0.01. The sums for the group that should total 0 were both -8.17E14 (different systems different precision likely).

Contributor
Posts: 28

Re: proc sql sum is inaccurate


After investigating, it seems that the issue is caused by sql which stores numbers as floating numbers and supposedly, I have to specify the sum as numeric(8,2) with a cast function.  I'm not sure how to do that, so instead of using proc sql I used proc summary and the issue went away.

Super User
Super User
Posts: 6,502

Re: proc sql sum is inaccurate

Good that you got the answer you needed, but your explanation is off.

PROC SUMMARY worked because by default it applies the format from the source variable to the derived statistic. So if you had 8.2 format attached to your PAID_AMOUNT variable then the SUM statistic would also be displayed using the 8.2 format.  In SQL syntax you just need to add the FORMAT option to your derived variable to see the same results.

select pat_id, admit, discharge, sum(paid_amount) as paid_amount format=8.2

SAS stores numbers as floating point.  There is no way to represent decimal values (like dollars and cents) exact using binary floating point numbers. Rounding in advance won't really help. Rounding after summing can help a little.  If you really want to work with dollars and cents then you might consider storing the values as # of CENTS instead of # of DOLLARS because then they will be store as integers and the inability of binary floating point format to exactly represent decimal fractions will not apply.

Contributor
Posts: 28

Re: proc sql sum is inaccurate

Thanks for clarifying - my non explanation was just what I thought I gathered from trying to find a solution.  I don't know much if anything about how SAS stores numbers so this helps!

Community Manager
Posts: 2,768

Re: proc sql sum is inaccurate

An exhaustive explanation about floating point numbers in SAS (and in software in general):

http://blogs.sas.com/content/sasdummy/2012/03/01/precision-in-sas-numbers/

There you will learn more than you ever wanted to know!

Chris

Ask a Question
Discussion stats
  • 9 replies
  • 1054 views
  • 3 likes
  • 5 in conversation