BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Hello_there
Lapis Lazuli | Level 10

Hi,

I'm doing an efficiency exercise for data manipulation. Let's say I'm given the task of doing a descriptive summary that involves only taking the mean and standard deviation. And there is a rule I have to follow. The rule is that I'm supposed to extend out 1 decimal place more than the original data's highest decimal length value for the mean, and 2 decimal places more for the standard deviation. So if the maximum decimal length extends out 3 decimals places,  it should look something like this:

meanSD: x.xxxx (x.xxxxx)

 

I've heard about the putn function, but haven't been able to find many examples online on how to use it.

This is what I've come up with below. The reason I would like a dynamic format is so that in case the data changes and the amount of decimals of the original data changes, and I have to re-run my program, my logic will account for it.

 

This is what I was able to come up with so far.

data have; 
infile datalines dsd dlm=",";
	input val;
datalines;
54.5
67.657
578.2
284.5
546.32
4687.2
6584.5
567.3
498.1
89.31
;
run;

data have1; set have;
valchar=put(val,best.);
decimal=scan(valchar, 2, "."); 
run;

proc sql;
	select max(length(decimal)) into: maxdec trimmed
	from have1;
quit;
%put The maximum number of decimals for this data is &maxdec.;

proc means data=have mean std noprint;
var val;
output out=a (drop=_:) mean=mean std=std;
run;

* mean and std rule;
%let mean=1;
%let std=2;

data want; set a;
	meanSD=strip(put(round(mean, 10**(%eval(-&maxdec +&mean))), 32.%eval(&maxdec+&mean))) || " (" ||
	strip(put(round(std, 10**(%eval(-&maxdec +&std))), 32.%eval(&maxdec+&std))) || ")";
run;

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

I would go for something like this:

 

%let var1=val;

data have; 
input &var1.chr $12.;
&var1. = input(&var1.chr, best.);
&var1.dec = max(0, length(&var1.chr) - indexc(&var1.chr, ".", " "));
drop &var1.chr;
datalines;
54.5
67.657
578.2
284.5
546.32
4687.2
6584.5
567.3
498.1
89.31
25
;

/* Set a limit to the max nb of decimals, 
   to safeguard against calculated values such as 1.3333333333 */
%let maxdec=4;

proc sql noprint;
select 
    1 + min(&maxdec., max(&var1.dec)) as d,
    1 + calculated d
    into : fmtavg&var1. trimmed, : fmtstd&var1. trimmed
from have;
quit;

proc sql;
select 
    mean(&var1.) format=12.&&fmtavg&var1. label="Mean",
    std(&var1.) format=12.&&fmtstd&var1. label="Std. Dev."
from have;
quit;

PGStats_0-1675110474144.png

 

PG

View solution in original post

8 REPLIES 8
ballardw
Super User

You are going to have a hard time with your rules about decimal places. If you read the data as numeric then the value is numeric and there isn't really any natural end of decimal places stored. Additionally if you have lots of decimal places in your data you may run into issues with storage precision in 8 bytes.

So the only way I see to accurately follow your rules would be to also read your source data as a character value to count decimal places.

Note, if your external data source has already treated values a numeric then you may already be hosed as they other system may well have already rounded values, added/removed positions treated as significant.

 

If you want to see an example of what I mean instead of 67.657 the value 67.6570. Which would imply that last 0 is significant for some reason or the source wouldn't have used that last 0. But your approach would not count that, or any other trailing 0.

 

Instead of exponentiation and rounding let the format do that for you. Look into the CAT, CATS, CATX and CATQ functions instead of || operator. You can generally get away from having to STRIP, LEFT or TRIM as many values.

data want; set a;
	meanSD= catx(' ',put(mean,16.%eval(&maxdec.+1)),cats('(',put(std,16.%eval(&maxdec.+2)),')'));
run;

 

Hello_there
Lapis Lazuli | Level 10
Maybe it was my imagination, but I thought I needed the Round() function because PUT() function just cuts the numbers off regardless if the next digit is supposed to round the last presentable digit up. But I will experiment with it and see if that's the case.

And thanks for the CAT tip, I will do that more often than using the pipes.
Tom
Super User Tom
Super User

You cannot count on being able to determine the number of decimal places from looking at the actual data.  What if everything happened to be an integer?

 

Store the number of decimal places with the other metadata about the variable.

Then you can use that value with either the PUTN() function or perhaps just a ROUND() function.

Hello_there
Lapis Lazuli | Level 10
Thanks for your response, Tom.

The PUTN() function is a little bit of a mystery to me. I read the documentation on it and it still isn't clear to me how that can help me and how that's different than the PUT() function. Would you be able to offer an example and explain the difference?
Tom
Super User Tom
Super User

@Hello_there wrote:
Thanks for your response, Tom.

The PUTN() function is a little bit of a mystery to me. I read the documentation on it and it still isn't clear to me how that can help me and how that's different than the PUT() function. Would you be able to offer an example and explain the difference?

The PUT() function requires the format to be known at compile time. That is why it works for either numeric or character values.  The PUTN() and PUTC() can wait until execution time to determine the format to apply, but that means you have to tell the compiler which type of values to expect by using PUTN() or PUTC().

 

So what this means is that the for the SECOND argument the PUT() function was an actual format specification. Such as COMMA12. and the PUTN() function wants a character string that has the format specification, such as "COMM12.".  

 

So if you have a variable named MAXDEC you can use it to generate the string you pass to PUTN().

 

Example:

data test;
  do x=0.1,1,10 ;
    do maxdec=0,1,2 ;
      string=putn(x,cats('5.',maxdec));
      output;
    end;
  end;
run;
OBS       x    maxdec    string

 1      0.1       0          0
 2      0.1       1        0.1
 3      0.1       2       0.10
 4      1.0       0          1
 5      1.0       1        1.0
 6      1.0       2       1.00
 7     10.0       0         10
 8     10.0       1       10.0
 9     10.0       2      10.00

 

Hello_there
Lapis Lazuli | Level 10
Tom, thanks for the explanation and the example! This helps my understanding.
PGStats
Opal | Level 21

I would go for something like this:

 

%let var1=val;

data have; 
input &var1.chr $12.;
&var1. = input(&var1.chr, best.);
&var1.dec = max(0, length(&var1.chr) - indexc(&var1.chr, ".", " "));
drop &var1.chr;
datalines;
54.5
67.657
578.2
284.5
546.32
4687.2
6584.5
567.3
498.1
89.31
25
;

/* Set a limit to the max nb of decimals, 
   to safeguard against calculated values such as 1.3333333333 */
%let maxdec=4;

proc sql noprint;
select 
    1 + min(&maxdec., max(&var1.dec)) as d,
    1 + calculated d
    into : fmtavg&var1. trimmed, : fmtstd&var1. trimmed
from have;
quit;

proc sql;
select 
    mean(&var1.) format=12.&&fmtavg&var1. label="Mean",
    std(&var1.) format=12.&&fmtstd&var1. label="Std. Dev."
from have;
quit;

PGStats_0-1675110474144.png

 

PG
Hello_there
Lapis Lazuli | Level 10
Thanks for taking time to write this! This gives me an alternative view and teaches me some things that are new to me.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 532 views
  • 3 likes
  • 4 in conversation