turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Rounding Errors with PROC Tabulate

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-04-2017 12:41 PM - edited 12-04-2017 12:42 PM

How can I make proc tabulate more accurate when summing numbers? I am getting unacceptably-large rounding errors in SAS compared to doing a pivot table on the same data in excel.

I am using the following command to read the file:

data new_dataset;

infile "&input_loc.\&input_file." dsd missover dlm = "," truncover lrecl=32767 firstobs = 2;

input

.....

sqft :12.2

....

;

run;

In the raw excel file with 25,799 rows, every number has two decimals. I have tried reading it in several ways--with no informat, by saving it with quoted commas "2,500.00" and reading it with comma12.2 but it makes no difference. The numbers look the same as what's in excel as far as I can tell, but I am getting a difference of 5,179,191 in the total for proc tabulate, vs 5,179,186 in excel. I have tried adding formats to sqft as well as to the Tabulate statement.

How can I make it tabulate with higher precision? Or, is it possible Excel is wrong?

Accepted Solutions

Solution

12-04-2017
01:15 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to shailey

12-04-2017 01:07 PM

Also, show the proc tabulate code.

You should also examine your cells in Excel to see if you are seeing a rounded value (typical) and the exact formula used for the total in question in Excel.

If you read data with an informat the values may well be rounded as you tell SAS not to read the additional decimals that may actually be present. A larger sum in SAS might indicate that you have negative values that are getting truncated by the informat.

Another possibility, since you are showing an infile statement and reading with a data step is the conversion process from Excel to the text file read truncated values.

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to shailey

12-04-2017 12:51 PM

You're asking about PROC TABULATE but posting code about an INPUT step.

Have you verified that the data is being read in correctly? You have the same number of rows, variables and that the sum to the same amounts?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

12-04-2017 01:04 PM

The proc tabulate is just a very standard

proc tabulate data=new_dataset format=comma12.2 MISSING;

var sqft;

table sqft * sum;

run;

The number of rows and columns are identical, and this particular tabulate statement doesn't have any class variables but some of the others do, and those subtotals are erratically off by 1 or 2. It completely looks like rounding errors.

Solution

12-04-2017
01:15 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to shailey

12-04-2017 01:07 PM

Also, show the proc tabulate code.

You should also examine your cells in Excel to see if you are seeing a rounded value (typical) and the exact formula used for the total in question in Excel.

If you read data with an informat the values may well be rounded as you tell SAS not to read the additional decimals that may actually be present. A larger sum in SAS might indicate that you have negative values that are getting truncated by the informat.

Another possibility, since you are showing an infile statement and reading with a data step is the conversion process from Excel to the text file read truncated values.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

12-04-2017 01:21 PM

Thank you! You were correct that there were some errant 3-digit numbers that were being hidden by Excel formatting. Originally I was not reading them in with an informat but just raw, but when I was saving the .xlsx file as a .csv file (to read in to SAS), Excel stupidly was only exporting the displayed value.

Once I changed the formatting of that column in the excel spreadsheet to 6 decimal places, saved it as a csv, and then read it in with the informat of 12.6, the total sum came out to the correct number of 5,179,186.

I didn't think about Excel's formatting!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to shailey

12-04-2017 02:28 PM

Just a heads up that Excel is also bad at automatically rounding values and doesn't tell you, especially if it's coming from a CSV file and the numbers are large. Check what happens if you do a 16 digit number.

You'll have the same issue in many systems (SAS) included so it's worth being aware that it can occur