macariarobinson

I've noticed that the sort order of a data set can slightly alter the results of PROC MEANS, even when the data are exactly the same. Has anyone else experienced this? I've worked around the issue by rounding everything before formatting the PROC MEANS output, but I'm very curious why SAS's internal processing can give slightly different results for the same data. Test data below shows what I mean. (Using SAS 9.4)

* Data set sorted by month, year, pressure;
proc sort data=sashelp.enso out=test1;
	by month year pressure;
run;
* Same data set sorted by pressure, month, year;
proc sort data=sashelp.enso out=test2;
	by pressure month year;
run;
* Proc Means;
proc means data = test1 noprint;
	var year;
  output n = XN1 mean = XMEAN1 std = XSTD1 out = _result1;
run;
* Proc Means;
proc means data = test2 noprint;
	var year;
  output n = XN1 mean = XMEAN1 std = XSTD1 out = _result2;
run;

** compare **;
proc compare base=_result1 compare=_result2;
run;
 
** standardized output **;
data _check1;
	length display $15;
	set _result1;
	display = strip(put(XMEAN1, 10.2));
run;
data _check2;
	length display $15;
	set _result2;
	display = strip(put(XMEAN1, 10.2));
run;
 
** compare **;
proc compare base=_check1 compare=_check2;
run;

 

ballardw

This is another example of numeric precision and the storage of decimal values in a binary system. The order in which values are added to the accumulated sum affects which bits, beyond what the precision of the storage allows, get lost. Note that most of the values of Year in that data set apparently represent some form of repeating decimal, such as 1 and 1/3 stored as 1.333333333333 (which has already lost some information if that is the case, as truncation at a specific point is not the same as 1/3).
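The effect of addition order can be seen without any data set at all. The following is a minimal sketch, assuming a platform that stores SAS numerics as IEEE 754 doubles (as Windows and Unix hosts do); the variable names and the constants 0.1, 0.2, and 0.3 are illustrative and are not taken from sashelp.enso.

```sas
* Add the same three values in two different orders;
data _null_;
	a = 0.1;
	b = 0.2;
	c = 0.3;
	left  = (a + b) + c;   * a and b are added first;
	right = a + (b + c);   * b and c are added first;
	diff  = left - right;  * nonzero when the two sums differ in their low-order bits;
	* HEX16. shows the stored floating-point representation of each sum;
	put left= hex16. right= hex16. diff= e12.;
run;
```

If the two hexadecimal values in the log differ in their last digit, the sums are not bit-for-bit identical even though the inputs are. PROC MEANS accumulating Year in two different sort orders is the same situation with more terms.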

 

That is one reason PROC COMPARE has options such as CRITERION= (and FUZZ= for the printed report) for deciding how close results must be before you consider them equal.
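A tolerance-based comparison of the two PROC MEANS outputs might look like the sketch below. It relies on the METHOD= and CRITERION= options of PROC COMPARE, which control when two numeric values are judged equal; the threshold of 1e-9 is an arbitrary assumption and should be chosen to suit the scale of the data.

```sas
* Treat values as equal when their absolute difference is at most 1e-9 (assumed tolerance);
proc compare base=_result1 compare=_result2
	method=absolute criterion=1e-9;
run;
```

With a tolerance like this, differences on the order of the last few bits of a double should no longer be reported as unequal values.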

Reeza

The differences are not meaningful. They're 0.0000000000014 or along those lines.

If differences of this size are meaningful, you're going to have issues with any software, because numerical precision is an issue for anything that uses a binary representation of numbers, i.e. computers today. Until quantum computing is more real.

 

That's not to say that you can't work around these obstacles if needed. There's an entire section in the SAS documentation dedicated to numerical precision.
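One such workaround, in the spirit of the rounding the original poster describes, could be sketched as follows. The data set names _round1 and _round2 and the rounding unit of 1e-8 are illustrative assumptions, not recommendations from the SAS documentation.

```sas
* Round the statistics to a fixed unit before comparing (1e-8 is an arbitrary choice);
data _round1;
	set _result1;
	XMEAN1 = round(XMEAN1, 1e-8);
	XSTD1  = round(XSTD1, 1e-8);
run;
data _round2;
	set _result2;
	XMEAN1 = round(XMEAN1, 1e-8);
	XSTD1  = round(XSTD1, 1e-8);
run;
* Compare the rounded results;
proc compare base=_round1 compare=_round2;
run;
```

Rounding removes most last-bit noise, although two values that straddle a rounding boundary can still round differently, so a tolerance in the comparison itself may be the more robust choice.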

 

http://support.sas.com/documentation/cdl/en/lrcon/69852/HTML/default/viewer.htm#p0ji1unv6thm0dn1gp4t...
