PROC MEANS and PROC UNIVARIATE producing different results for summary statistics

New Contributor
Posts: 2

My colleague and I are calculating summary statistics using two different methods. We don't understand why the results of BASE_VALUE are not the same. Both procedures report a standard deviation of 0.005, but after applying the PUT function below, the PROC MEANS StdDev = 0.01 while the PROC UNIVARIATE StdDev = 0.00. I have determined that the hex values are not the same, but I don't know why or which value is more correct. Any help or insight is much appreciated! Thanks!!

 

strip(PUT(COL1,8.2))

 

 

 

data test ;
	INPUT VALUE 8. ;
	DATALINES ;
0.12
0.12
0.12
0.11
	;
RUN ;

* Proc Means ;
proc means data = test noprint ;
	var VALUE ;
	output out = proc_mean mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

* Proc univariate ;
proc univariate data = test noprint ;
	var VALUE ;
	output out = proc_univ mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

* Transpose both ;
proc transpose data = proc_mean out= proc_mean2 ; run ;
proc transpose data = proc_univ out= proc_univ2 ; run ;

* Proc means data - Round to 2dp, determine hex value, and print ;
data proc_mean2 ;
	set proc_mean2 ;
	if _NAME_ in('STDDEVS') then do;
		BASE_VALUE = strip(PUT(COL1,8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
proc print ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'STDDEVS' ;
RUN ;

* Proc univariate data- Round to 2dp, determine Hex value, and print ;
data proc_univ2 ;
	set proc_univ2 ;
	if _NAME_ in('STDDEVS') then do;
		BASE_VALUE = strip(PUT(COL1,8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
proc print ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'STDDEVS' ;
RUN ;

Super User
Posts: 19,878

Re: PROC MEANS and PROC UNIVARIATE producing different results for summary statistics

Instead of PUT alone, apply ROUND() before formatting.

 

I get different HEX values, but with ROUND() the Base values are the same.

Note that I had to change some of your code to get this to run properly. 

 

PS. It helps if you add comments to your code.

 

data test ;
	INPUT VALUE 8. ;
	DATALINES ;
0.12
0.12
0.12
0.11
	;
RUN ;

proc means data = test noprint ;
	var VALUE ;	
	output out = proc_mean mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

proc univariate data = test noprint ;
	var VALUE ;	
	output out = proc_univ mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

proc transpose data = proc_mean out= proc_mean2 ; run ;
proc transpose data = proc_univ out= proc_univ2 ; run ;

* Proc means data - round to 2dp, get the hex value, and print ;
data proc_mean3 ;
	set proc_mean2 ;
	if _NAME_ in('StdDevS') then do;
		BASE_VALUE = strip(put(round(COL1,0.01), 8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
run ;

proc print data=proc_mean3;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'StdDevS' ;
RUN ;


* Proc univariate data - round to 2dp, get the hex value, and print ;
data proc_univ2 ;
	set proc_univ2 ;
	if _NAME_ in('StdDevS') then do;
		BASE_VALUE = strip(put(round(COL1, 0.01), 8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
run ;

proc print data=proc_univ2 ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'StdDevS' ;
RUN ;

 

 

EDIT: Slight mod to remove the notes in the log about character types. 

EDIT2: Tested on SAS 9.4 TS1M3 - you should make sure you're using the same versions as well, though I suspect that's not the issue - and it shouldn't be. 

Super User
Posts: 19,878

Re: PROC MEANS and PROC UNIVARIATE producing different results for summary statistics

And if you need the data transposed look at the ODS SUMMARY table instead.
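A minimal sketch of that suggestion, reusing the TEST data set from the earlier posts. The output data set name means_summary is made up for the example, and STACKODSOUTPUT is assumed to be available in your release (it is a PROC MEANS option in recent versions such as SAS 9.4). With that option the Summary table should have one row per analysis variable and one column per statistic, so no PROC TRANSPOSE step is needed.

```sas
* NOPRINT is left off on purpose: ODS OUTPUT needs the procedure to produce its table ;
ods output summary = means_summary ;

proc means data = test n mean std min max median stackodsoutput ;
	var VALUE ;
run ;

* Inspect the captured table (means_summary is a hypothetical name) ;
proc print data = means_summary ;
run ;
```

Without STACKODSOUTPUT the same ODS table is a single wide row, so it is worth printing the result once to see which layout you get.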

New Contributor
Posts: 2

Re: PROC MEANS and PROC UNIVARIATE producing different results for summary statistics

Thanks Reeza!

 

When I change to ROUND(), I get the same BASE_VALUE; however, the HEX values are still different between the procedures. I am hesitant to change to ROUND() because this is just one mismatch example. The various other statistics have matching results; only this one value for SD does not match between PROC UNIVARIATE and PROC MEANS after PUT().

Super User
Posts: 19,878

Re: PROC MEANS and PROC UNIVARIATE producing different results for summary statistics

@arg1 I agree. I think this is an artifact of the very small data set, and I don't think the difference is significant. The values are effectively the same. I understand why you want confirmation, though, and I think the next step would be SAS Tech Support.

 

If you test it with a larger data set, then it works fully as expected - the HEX and BASE values included. I think it's Excel that can actually return a negative variance, which isn't mathematically possible, but that requires a small N.

 

data test ;
	set sashelp.class ;
	value = weight ;
RUN ;

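One possible source of last-bit differences like this is that algebraically equivalent formulas round differently in floating point. The sketch below is only an illustration under that assumption; nothing in this thread confirms which computing formulas PROC MEANS or PROC UNIVARIATE actually use. It computes the standard deviation of the four original values twice in a DATA step, once from the running sum and sum of squares and once from squared deviations about the mean, and prints both results in HEX16. so any difference in the stored bits is visible. All variable names are made up for the example.

```sas
data _null_ ;
	array v[4] _temporary_ (0.12 0.12 0.12 0.11) ;
	n = dim(v) ;

	* One-pass form: accumulate the sum and the sum of squares ;
	do i = 1 to n ;
		s + v[i] ;
		ss + v[i]**2 ;
	end ;
	sd_onepass = sqrt((ss - s**2 / n) / (n - 1)) ;

	* Two-pass form: subtract the mean first, then square ;
	avg = s / n ;
	do i = 1 to n ;
		css + (v[i] - avg)**2 ;
	end ;
	sd_twopass = sqrt(css / (n - 1)) ;

	* Print the stored bits and the difference between the two results ;
	diff = sd_onepass - sd_twopass ;
	put sd_onepass= hex16. sd_twopass= hex16. ;
	put diff= e12. ;
run ;
```

The one-pass form subtracts two nearly equal numbers, and that kind of cancellation is how a computed variance can come out slightly negative in some software.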
Super User
Posts: 5,518

Re: PROC MEANS and PROC UNIVARIATE producing different results for summary statistics

I think you will need to take this up with SAS.  I ran a small test program against your example:

 

proc compare base=proc_mean compare=proc_univ;

run;

 

I wasn't surprised to see that some variable labels are different.  But I was surprised to see that the standard deviations differ (to a tiny degree, on the order of e-18):

 

Obs      ||   StdDevS    StdDevS      Diff.     % Diff
________ || _________  _________  _________  _________
         ||
1        ||  0.005000   0.005000   -2.6E-18   -5.2E-14
__________________________________________________________
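A difference that small only matters here because the standard deviation sits right at the 0.005 boundary for two-decimal rounding. The sketch below is a hedged illustration, not a statement of what either procedure stores: it takes the literal 0.005, builds a second value that is smaller by the reported difference (2.6E-18), and prints both with the HEX16. and 8.2 formats so they can be compared side by side. The names a, b and diff are made up for the example. The PROC COMPARE step then repeats the comparison above with METHOD=ABSOLUTE and CRITERION=, so that differences no larger than the chosen tolerance (1E-12 here, an arbitrary choice) are treated as equal.

```sas
* Two nearby doubles close to the 0.005 rounding boundary (hypothetical values) ;
data _null_ ;
	a = 0.005 ;
	b = a - 2.6E-18 ;
	diff = a - b ;
	put a= hex16. b= hex16. ;
	put a= 8.2 b= 8.2 ;
	put diff= e12. ;
run ;

* Compare the two output data sets from the earlier posts, allowing a small tolerance ;
proc compare base = proc_mean compare = proc_univ method = absolute criterion = 1E-12 ;
run ;
```

If the two formatted values differ in your session, that reproduces the original 0.01 versus 0.00 mismatch from a purely representational difference, which is why rounding first or comparing with a tolerance is the usual workaround.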

 
