BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shailey
Obsidian | Level 7

How can I make proc tabulate more accurate when summing numbers? I am getting unacceptably-large rounding errors in SAS compared to doing a pivot table on the same data in excel.

 

I am using the following command to read the file:

 

data new_dataset; 
     infile "&input_loc.\&input_file." dsd missover dlm = "," truncover lrecl=32767 firstobs = 2;
     input

     .....
     sqft :12.2

     ....

     ; 

run; 

 

In the raw excel file with 25,799 rows, every number has two decimals. I have tried reading it in several ways--with no informat, by saving it with quoted commas "2,500.00" and reading it with comma12.2 but it makes no difference. The numbers look the same as what's in excel as far as I can tell, but I am getting a difference of 5,179,191 in the total for proc tabulate, vs 5,179,186 in excel. I have tried adding formats to sqft as well as to the Tabulate statement. 

 

How can I make it tabulate with higher precision? Or, is it possible Excel is wrong? 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Also, show the proc tabulate code.

You should also examine your cells in Excel to see if you are seeing a rounded value (typical) and the exact formula used for the total in question in Excel.

 

If you read data with an informat the values may well be rounded as you tell SAS not to read the additional decimals that may actually be present. A larger sum in SAS might indicate that you have negative values that are getting truncated by the informat.

 

Another possibility, since you are showing an infile statement and reading with a data step is the conversion process from Excel to the text file read truncated values.

View solution in original post

5 REPLIES 5
Reeza
Super User

You're asking about PROC TABULATE but posting code about an INPUT step. 

 

Have you verified that the data is being read in correctly? You have the same number of rows, variables and that the sum to the same amounts?

shailey
Obsidian | Level 7

The proc tabulate is just a very standard 

 

proc tabulate data=new_dataset format=comma12.2 MISSING;
     var sqft;
     table sqft * sum; 
run; 

 

The number of rows and columns are identical, and this particular tabulate statement doesn't have any class variables but some of the others do, and those subtotals are erratically off by 1 or 2. It completely looks like rounding errors.

 

 

ballardw
Super User

Also, show the proc tabulate code.

You should also examine your cells in Excel to see if you are seeing a rounded value (typical) and the exact formula used for the total in question in Excel.

 

If you read data with an informat the values may well be rounded as you tell SAS not to read the additional decimals that may actually be present. A larger sum in SAS might indicate that you have negative values that are getting truncated by the informat.

 

Another possibility, since you are showing an infile statement and reading with a data step is the conversion process from Excel to the text file read truncated values.

shailey
Obsidian | Level 7

Thank you! You were correct that there were some errant 3-digit numbers that were being hidden by Excel formatting. Originally I was not reading them in with an informat but just raw, but when I was saving the .xlsx file as a .csv file (to read in to SAS), Excel stupidly was only exporting the displayed value. 

 

Once I changed the formatting of that column in the excel spreadsheet to 6 decimal places, saved it as a csv, and then read it in with the informat of 12.6, the total sum came out to the correct number of 5,179,186. 

 

I didn't think about Excel's formatting! 

Reeza
Super User

Just a heads up that Excel is also bad at automatically rounding values and doesn't tell you, especially if it's coming from a CSV file and the numbers are large. Check what happens if you do a 16 digit number. 

 

You'll have the same issue in many systems (SAS) included so it's worth being aware that it can occur 🙂

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1324 views
  • 2 likes
  • 3 in conversation