arg1
Calcite | Level 5

My colleague and I are calculating summary statistics using two different methods.  We don't understand why the results of BASE_VALUE are not the same.  Both procedures present the standard deviation as 0.005, but after applying the PUT function below, the PROC MEANS StdDev = 0.01 while the PROC UNIVARIATE StdDev = 0.00.  I have determined that the hex values are not the same, but I don't know why or which value is more correct.  Any help or insight is much appreciated!  Thanks!!

 

strip(PUT(COL1,8.2))

 

 

 

data test ;
	INPUT VALUE 8. ;
	DATALINES ;
0.12
0.12
0.12
0.11
	;
RUN ;

* Proc Means ;
proc means data = test noprint ;
	var VALUE ;
	output out = proc_mean mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

* Proc univariate ;
proc univariate data = test noprint ;
	var VALUE ;
	output out = proc_univ mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

* Transpose both ;
proc transpose data = proc_mean out= proc_mean2 ; run ;
proc transpose data = proc_univ out= proc_univ2 ; run ;

* Proc means data - Round to 2dp, determine hex value, and print ;
data proc_mean2 ;
	set proc_mean2 ;
	if _NAME_ in('STDDEVS') then do;
		BASE_VALUE = strip(PUT(COL1,8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
proc print ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'STDDEVS' ;
RUN ;

* Proc univariate data - Round to 2dp, determine hex value, and print ;
data proc_univ2 ;
	set proc_univ2 ;
	if _NAME_ in('STDDEVS') then do;
		BASE_VALUE = strip(PUT(COL1,8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
proc print ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'STDDEVS' ;
RUN ;

Capture.PNG

 

 

5 REPLIES
Reeza
Super User

Instead of PUT use ROUND(). 

 

I get the same results as you for the HEX values (they still differ between the two procedures), but with ROUND() the Base values are the same.

Note that I had to change some of your code to get this to run properly. 

 

PS. It helps if you add comments to your code.

 

data test ;
	INPUT VALUE 8. ;
	DATALINES ;
0.12
0.12
0.12
0.11
	;
RUN ;

proc means data = test noprint ;
	var VALUE ;	
	output out = proc_mean mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

proc univariate data = test noprint ;
	var VALUE ;	
	output out = proc_univ mean = MeanS std = StdDevS min = MinS max = MaxS n = nS median = MedianS ;
run ;

proc transpose data = proc_mean out= proc_mean2 ; run ;
proc transpose data = proc_univ out= proc_univ2 ; run ;

* Proc means data - round to 2dp before formatting, keep the hex value ;
data proc_mean3 ;
	set proc_mean2 ;
	if _NAME_ in('StdDevS') then do;
		BASE_VALUE = strip(put(round(COL1,0.01), 8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
run ;

proc print data=proc_mean3 ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'StdDevS' ;
RUN ;

* Proc univariate data - same steps, data set overwritten in place ;
data proc_univ2 ;
	set proc_univ2 ;
	if _NAME_ in('StdDevS') then do;
		BASE_VALUE = strip(put(round(COL1, 0.01), 8.2)) ;
		HEX_VALUE = put(COL1,HEX16.) ;
	end ;
run ;

proc print data=proc_univ2 ;
	var _NAME_ COL1 BASE_VALUE HEX_VALUE ;
	WHERE _NAME_ = 'StdDevS' ;
RUN ;

 

 

EDIT: Slight mod to remove the notes in the log about character types. 

EDIT2: Tested on SAS 9.4 TS1M3 - you should make sure you're using the same versions as well, though I suspect that's not the issue - and it shouldn't be. 

Reeza
Super User

And if you need the data transposed look at the ODS SUMMARY table instead.
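
A minimal sketch of that suggestion, assuming the TEST data set and VALUE variable from the question. The output data set name means_summary is a hypothetical name chosen for illustration. NOPRINT is left off here because ODS OUTPUT needs the procedure to produce its Summary table.

```sas
/* Capture the PROC MEANS "Summary" ODS table in a data set. */
/* means_summary is a hypothetical name, not from the thread. */
ods output Summary = means_summary ;

proc means data = test n mean std min max median stackodsoutput ;
	var VALUE ;
run ;

/* Look at the captured statistics */
proc print data = means_summary ;
run ;
```

With STACKODSOUTPUT the captured table should have one row per analysis variable and one column per statistic, which may remove the need for the PROC TRANSPOSE step.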

arg1
Calcite | Level 5

Thanks Reeza!

 

When I change to ROUND(), I get the same BASE_VALUE; however, the HEX values are still different between the procedures. I am hesitant to change to ROUND() because this is just one mismatch example.  The various other statistics have matching results; only this one value for SD does not match between PROC UNIVARIATE and PROC MEANS after PUT().  

Reeza
Super User

@arg1 I agree. I think this is an artifact of the very small data set, and I don't think the difference is significant. For practical purposes the values are the same, but I understand why you want confirmation, and I think the next step would be SAS Tech Support. 

 

If you test it with a larger data set then it works fully as expected - the HEX and BASE values included. I think it's Excel where you can actually get a negative variance returned - which isn't mathematically possible - but that also requires a small N. 

 

data test ;
	set sashelp.class ;
	value = weight ;
RUN ;
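
To see how far apart the two standard deviations really are, a sketch along the following lines could be run after the PROC MEANS and PROC UNIVARIATE steps from the question. It assumes the PROC_MEAN and PROC_UNIV output data sets exist with one row each; the names sd_means, sd_univ, diff and equal_after_round, and the 1e-10 rounding unit, are assumptions chosen for illustration.

```sas
data _null_ ;
	/* One-to-one merge of the two single-row output data sets */
	merge proc_mean(keep = StdDevS rename = (StdDevS = sd_means))
	      proc_univ(keep = StdDevS rename = (StdDevS = sd_univ)) ;

	/* Raw difference between the two stored values */
	diff = sd_means - sd_univ ;
	put sd_means= hex16. / sd_univ= hex16. / diff= e12. ;

	/* Compare after rounding to a unit far larger than the difference */
	equal_after_round = (round(sd_means, 1e-10) = round(sd_univ, 1e-10)) ;
	put equal_after_round= ;
run ;
```

The log then shows both stored bit patterns, their difference, and whether the values agree once rounded.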
Astounding
PROC Star

I think you will need to take this up with SAS.  I ran a small test program against your example:

 

proc compare base=proc_mean compare=proc_univ;

run;

 

I wasn't surprised to see that some variable labels are different.  But I was surprised to see that the standard deviations differ (to a tiny degree, on the order of e-18):

 

Obs      || StdDevS   StdDevS   Diff.      % Diff
________ || _________ _________ _________ _________
         ||
1        || 0.005000  0.005000  -2.6E-18  -5.2E-14
__________________________________________________________
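
If a difference of this size is acceptable, PROC COMPARE can be told to ignore it. The sketch below assumes the same PROC_MEAN and PROC_UNIV data sets; the 1e-12 tolerance is an arbitrary assumption, not a recommended value.

```sas
/* Treat values as equal when they differ by no more than the criterion. */
/* The tolerance 1e-12 is an illustrative assumption. */
proc compare base = proc_mean compare = proc_univ
             method = absolute criterion = 1e-12 ;
	var StdDevS ;
run ;
```

With METHOD=ABSOLUTE, two values are judged unequal only when their absolute difference exceeds CRITERION=, so a gap on the order of e-18 should no longer be reported as a difference.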

 
