Hi,
I am trying to create a function in PROC FCMP that converts a numeric value to character and prints it to s significant figures. The function has three inputs: x, the numeric value; s, the number of significant figures to be reported; and l, the number of figures to the right of the decimal point when printing (to allow decimal alignment when using different s.f. across different observations).
The function works, but the log shows a "W.D format was too small for the number to be printed" note. When I pulled all the code out of the function into a data step (test2), the note disappeared. Any idea where it comes from, or how I could debug it?
While testing I found that the issue comes only from the derivation of c and d in test1, and only for the first record in test. Test2 is set up to produce the same output as variable c in test1.
data test;
a=2981248.76403; output; a=1248.76403; output; a=248.76403; output; a=48.76403; output; a=8.76403; output; a=0.76403; output; a=0.076403; output; a=0.0076403; output; a=0.00076403; output; a=0.000076403; output; a=0.00007640387106801; output;
RUN;
proc fcmp outlib=work.funcs.test;
function sigdig(x,s,l) $200;
length g e f $200;
if x then do;
a=floor(log10(abs(x)));
b=a-(s-1);
c=round(x,10**(b));
cc=compress(put(x, best32.));
ccc=length(compress(scan(cc, 2, ".")));
d=floor(log10(abs(c)));
e=cat(compress(put(l, best.)),".");
ee=2-d+(s-3);
eee=if ccc>0 and ee>0 then min(ccc, ee)
else ee;
ff=if eee<=0 THEN l
ELSE l+1+eee;
f=cat(compress(put(ff, best.)),".",compress(put(eee, best.)));
g=if eee<=0 then putn(c,e)
else putn(c,f);
end;
else do;
e=cat(compress(put(l, best.)),".");
g=putn(c,e);
end;
return(g);
endsub;
run;
options cmplib=work.funcs;
data test1;
SET test;
c=sigdig(a, 3, 8);
d=sigdig(a, 4, 8);
e=sigdig(a, 5, 8);
f=sigdig(a, 6, 8);
g=sigdig(a, 7, 8);
h=sigdig(a, 8, 8);
RUN;
data test2;
set test (rename=(a=x));
l=8;
s=3;
length g e f $200;
if x then do;
a=floor(log10(abs(x)));
b=a-(s-1);
c=round(x,10**(b));
cc=compress(put(x, best32.));
ccc=length(compress(scan(cc, 2, ".")));
d=floor(log10(abs(c)));
e=cat(compress(put(l, best.)),".");
ee=2-d+(s-3);
if ccc>0 and ee>0 then eee=min(ccc, ee);
else eee=ee;
if eee<=0 THEN ff=l;
ELSE ff=l+1+eee;
f=cat(compress(put(ff, best.)),".",compress(put(eee, best.)));
if eee<=0 then g=putn(c,e);
else g=putn(c,f);
end;
else do;
e=cat(compress(put(l, best.)),".");
g=putn(c,e);
end;
run;
Apologies for the messiness of the code; I was going round in circles while testing.
Thanks
First question: Why not simply use the Put Function?
For the data in TEST, could you give the code to print to 3 significant figures?
As far as I can see I would need to use a different put statement for each line.
Suggest you provide some sample data that also include a column with the desired result. This should remove a lot of ambiguity.
I don't fully understand what you're after. For example: how many significant figures (digits?) does 100.0 have? And how would you want to display it when printing "to 3 significant figures"?
The last set of digits in your sample represents a number that SAS can't store with full precision.
data test;
attrib
a_num length=8 format=best32. informat=best32.
a_char length=$32 format=$32. informat=$32.;
input @1 a_num @1 a_char;
datalines;
2981248.76403
1248.76403
248.76403
48.76403
8.76403
0.76403
0.076403
0.0076403
0.00076403
0.000076403
0.00007640387106801
;
RUN;
proc print data=test;
run;
@Patrick wrote:
The last set of digits in your sample represents a number that SAS can't store with full precision.
I think the internal precision is sufficient; it's just the BEST32. format rounding incorrectly in this case (whereas the E32. format shows all relevant digits).
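For anyone who wants to check this on their own system, here is a minimal sketch (not from the original posts) that writes the same stored value with several formats. The HEX16. line is my own addition to show the raw 8-byte floating-point representation; the exact digits printed may differ by platform, so no expected output is claimed here.

```sas
data _null_;
  x = 0.00007640387106801;
  /* Same stored value, three renderings */
  put 'BEST32.: ' x best32.;
  put 'E32.:    ' x e32.;
  put 'HEX16.:  ' x hex16.;  /* raw floating-point representation */
run;
```

If the E32. line shows all the digits that were typed in, the value itself is stored well enough and only the BEST32. rendering is rounding.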
Below is a print of TEST1, based on the input dataset TEST. It rounds column A (numeric) to 3 significant figures in col C, 4 s.f. in col D, 5 s.f. in col E, 6 s.f. in col F, 7 s.f. in col G and 8 s.f. in col H.
The function is working as expected; it just produces a W.D format note that I cannot reproduce when I use the same code in a data step.
@FreelanceReinh Yes, you're right. No precision issue. Thanks for pointing that out.
@SwissC Before writing the function I suggest you get data step code as clean as possible.
From what I understand there are two separate problems:
1. How to create a number with the desired significant figures?
2. How to display/print numbers?
I believe you solved 1 already.
Point 2 is a bit harder. There isn't a format that does such alignment around a decimal point. The only way I can think of is to create a string (character variable) with leading spaces. But that won't be enough: you then also have to use a non-proportional (monospaced) typeface like Courier AND you need to ensure that the leading blanks are protected. How to do this will depend on the output destination (like &nbsp; and/or the <pre> tag for HTML).
Below sample code works for writing to the SAS log and should also work for listing output. Other output destinations will require additional work.
data test;
attrib
char_value length=$32 informat=$32.
value length=8 informat=best32. format=f32.31;
input @1 char_value @1 value;
datalines;
2981248.76403
1248.76403
248.76403
48.76403
8.76403
0.76403
0.076403
0.0076403
0.00076403
0.000076403
0.00007640387106801
-0.000076403
;
RUN;
%let sig_figures=3;
data rounded;
set test;
/** populate numerical variable with desired number of significant figures **/
format SigFig_Value best32.;
sig_figures=&sig_figures;
/* Round the value to the requested number of significant figures */
if value = 0 then SigFig_Value = 0;
else SigFig_Value = round(value, 10**floor(log10(abs(value))-sig_figures+1));
/** create character variable with leading blanks so decimal point is always at the same string position **/
format char_SigFig_Value $char60.;
char_SigFig_Value=put(SigFig_Value,best32. -l);
char_SigFig_Value=cat(repeat(' ',18-length(scan(char_SigFig_Value,1,'.'))),char_SigFig_Value);
/* write character variable to log where typeset is Courier */
put char_SigFig_Value $char60.;
run;
/* proc print data=rounded noobs; */
/* format char_SigFig_Value $char60.; */
/* run; */
That is useful. However, using BEST to make the string causes issues when the least significant of the requested digits is zero, because BEST drops trailing zeros. You can see this with your example data when 8 significant figures are asked for.
-2981248.8
2981248.8
1248.764
248.76403
48.76403
8.76403
0.76403
0.076403
0.0076403
0.00076403
0.000076403
0.000076403871
-0.000076403
It probably would be better to convert the number to an integer and then add back the decimal point (and possibly leading zeros) at the proper place.
if x then magnitude = 1+floor(log10(abs(x)));
else magnitude=1;
integer = round(x * 10**(sig_figures-magnitude));
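To make the idea concrete, here is one way the remaining step (putting the decimal point and any leading or trailing zeros back) could be sketched. This is my own illustration under stated assumptions, not tested production code: the variable names digits, result and sig are hypothetical, sig is assumed to be between 1 and 15, and the sign is handled separately by working on abs(x).

```sas
data _null_;
  length digits result $64;
  sig = 8;  /* assumed number of significant figures, 1 to 15 */
  do x = 2981248.76403, 48.76403, 0.076403, -0.000076403;
    /* magnitude = digits left of the decimal point (<= 0 when |x| < 1) */
    if x then magnitude = 1 + floor(log10(abs(x)));
    else magnitude = 1;
    integer = round(abs(x) * 10**(sig - magnitude));
    /* rounding can carry into an extra digit, e.g. 9.996 at 3 s.f. */
    if integer >= 10**sig then do;
      integer = integer / 10;
      magnitude = magnitude + 1;
    end;
    /* exactly SIG digits, zero padded (this also covers x = 0) */
    digits = putn(integer, cats('Z', sig, '.'));
    /* put the decimal point back at the proper place */
    if magnitude > sig then
      result = cats(digits, repeat('0', magnitude - sig - 1));
    else if magnitude = sig then result = digits;
    else if magnitude > 0 then
      result = cats(substr(digits, 1, magnitude), '.', substr(digits, magnitude + 1));
    else if magnitude = 0 then result = cats('0.', digits);
    else result = cats('0.', repeat('0', -magnitude - 1), digits);
    if x < 0 then result = cats('-', result);
    put x= best32. result=;
  end;
run;
```

Because the digits come from an integer written with a Z format, trailing zeros that count as significant are kept rather than dropped the way BEST drops them.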
I am going to make a style comment about your function.
If I had inherited code with that function definition as shown my first reaction is "Why the !@#$!@#$!@# are there no comments at all?"
The second is "Why does this exist? I have no way to tell why it is needed".
Third is "Do I really have to spend a bunch of hours trying to figure out what each of these parameters is supposed to be for?"
As an absolute minimum, any function should include comments in the code that describe the expected values of the input parameters, such as "values must be integers". Since this function claims to do something related to formats, one of the parameters likely must be greater than 0 and less than 32 (whatever is equivalent to W). If the parameters interact, as W and D do in formats (D must be < W), that should be stated. Better still would be to also include comments on why the function is needed.
I didn't spend a lot of time trying to parse the code, but I am not sure which parameter is the number of decimals and which is the number of significant digits. That means I can't suggest where in your code you have to test the VALUE to see whether the number of significant digits makes sense, or whether the W in your PUTN calls is large enough. Hint: if the value is 1000 or greater, W must be 4 or larger, or PUT or PUTN will generate the note "W.D format was too small for the number to be printed."
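As a rough sketch of that kind of check (my own illustration, not a diagnosis of the posted function), the width a value needs can be computed before the format string is built. The names d, xr, int_digits, w_needed, fmt and txt are hypothetical, and d is an assumed number of decimals.

```sas
data _null_;
  length fmt $8 txt $32;
  d = 0;  /* assumed number of decimals */
  do x = 999, 1000, -2981248.76403;
    /* round first, because rounding can add an integer digit (999.6 -> 1000) */
    xr = round(x, 10**(-d));
    if xr then int_digits = max(1, 1 + floor(log10(abs(xr))));
    else int_digits = 1;
    /* integer digits + minus sign + decimal point and decimals */
    w_needed = int_digits + (xr < 0) + (d > 0) * (1 + d);
    fmt = cats(w_needed, '.', d);
    txt = putn(x, fmt);
    put x= w_needed= fmt= txt=;
  end;
run;
```

Comparing w_needed with the W actually passed to PUTN inside the function, for each observation, should show which call is too narrow.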
If the error occurs in the calculation of C then can you explain the meaning of the value of C?
While you are at it explain what the values of A and B, which are used to create C, are.
What is CC?
proc fcmp outlib=work.funcs.test;
function sigdig(x,s,l) $200;
length g e f $200;
if x then do;
a=floor(log10(abs(x)));
b=a-(s-1);
c=round(x,10**(b));
cc=compress(put(x, best32.));
ccc=length(compress(scan(cc, 2, ".")));
From the assignment statement it looks like you want it to be a character variable.
But you did not set any length for it, like you did for G E and F.
Why are you using COMPRESS() function there? Did you mean to use the LEFT() function to remove the leading spaces? If so then just add the -L option to the format specification.
data test;
x=2981248.76403;
string1=put(x,best32.);
string2=put(x,best32.-L);
format string: $quote. ;
put (_all_) (=/);
run;

x=2981248.764
string1="                   2981248.76403"
string2="2981248.76403"
You should clean up those and other similar things and see if the note goes away.
Your code seems needlessly complex. You should leverage the existing formats.
For example:
data VALUES;
infile cards truncover;
input VAL 32.;
cards;
-2981248.8
2981248.8
1248.764
248.76403
48.76403
8.76403
0.76403
0.076403
0.0076403
0.00076403
0.000076403
0.000076403871
-0.000076403
;
data TEST;
set VALUES;
SIG = 3;
E = putn(VAL, cats('E', SIG+6, '.'));
NUM = input(E, 32.);
run;
proc print;
format _NUMERIC_ best32.;
run;
Obs. | VAL | SIG | E | NUM |
---|---|---|---|---|
1 | -2981248.8 | 3 | -2.98E+06 | -2980000 |
2 | 2981248.8 | 3 | 2.98E+06 | 2980000 |
3 | 1248.764 | 3 | 1.25E+03 | 1250 |
4 | 248.76403 | 3 | 2.49E+02 | 249 |
5 | 48.76403 | 3 | 4.88E+01 | 48.8 |
6 | 8.76403 | 3 | 8.76E+00 | 8.76 |
7 | 0.76403 | 3 | 7.64E-01 | 0.764 |
8 | 0.076403 | 3 | 7.64E-02 | 0.0764 |
9 | 0.0076403 | 3 | 7.64E-03 | 0.00764 |
10 | 0.00076403 | 3 | 7.64E-04 | 0.000764 |
11 | 0.000076403 | 3 | 7.64E-05 | 0.0000764 |
12 | 0.000076403871 | 3 | 7.64E-05 | 0.0000764 |
13 | -0.000076403 | 3 | -7.64E-05 | -0.0000764 |
Perhaps something like this:
proc fcmp outlib=WORK.FUNCS.TEST;
function sigdig(VAL, SIG, FMT $) $32;
%* Input validation to be added;
length E $32;
E = putn(VAL, cats('E', SIG+6, '.'));
if upcase(FMT)='E' then return(strip(E));
else return(cat(input(E, 32.)));
endsub;
run;
options cmplib=WORK.FUNCS;
data FCT;
set VALUES;
NUM1 = sigdig(VAL, 4, 'S');
NUM2 = sigdig(VAL, 4, 'E');
run;
Obs. | VAL | NUM1 | NUM2 |
---|---|---|---|
1 | -2981248.8 | -2981000 | -2.981E+06 |
2 | 2981248.8 | 2981000 | 2.981E+06 |
3 | 1248.764 | 1249 | 1.249E+03 |
4 | 248.76403 | 248.8 | 2.488E+02 |
5 | 48.76403 | 48.76 | 4.876E+01 |
6 | 8.76403 | 8.764 | 8.764E+00 |
7 | 0.76403 | 0.764 | 7.640E-01 |
8 | 0.076403 | 0.0764 | 7.640E-02 |
9 | 0.0076403 | 0.00764 | 7.640E-03 |
10 | 0.00076403 | 0.000764 | 7.640E-04 |
11 | 0.000076403 | 0.0000764 | 7.640E-05 |
12 | 0.000076403871 | 0.0000764 | 7.640E-05 |
13 | -0.000076403 | -0.0000764 | -7.640E-05 |
Or this if you want to ensure the display of all digits (see report lines 4 and 14).
data VALUES;
infile cards truncover;
input VAL 32.;
cards;
-2981248.8
2981248.8
1248.764
248.0076403
48.76403
8.76403
0.76403
0.076403
0.0076403
0.00076403
0.000076403
0.000076403871
-0.000076403
1
;
proc fcmp outlib=WORK.FUNCS.TEST;
function sigdig(VAL, SIG, FMT $) $32;
%* Input validation to be added;
length E F $32;
E = putn(VAL, cats('E', SIG+6, '.'));
if upcase(FMT)='E' then return(strip(E));
else do;
F = cat(input(E, 32.));
L = length(F);
if L < SIG then F = cats(F,'.',repeat('0',SIG-L-1));
return(strip(F));
end;
endsub;
run;
options cmplib=WORK.FUNCS;
data FCT;
set VALUES;
NUM1 = sigdig(VAL, 4, 'S');
NUM2 = sigdig(VAL, 4, 'E');
run;
Obs. | VAL | NUM1 | NUM2 |
---|---|---|---|
1 | -2981248.8 | -2981000 | -2.981E+06 |
2 | 2981248.8 | 2981000 | 2.981E+06 |
3 | 1248.764 | 1249 | 1.249E+03 |
4 | 248.0076403 | 248.0 | 2.480E+02 |
5 | 48.76403 | 48.76 | 4.876E+01 |
6 | 8.76403 | 8.764 | 8.764E+00 |
7 | 0.76403 | 0.764 | 7.640E-01 |
8 | 0.076403 | 0.0764 | 7.640E-02 |
9 | 0.0076403 | 0.00764 | 7.640E-03 |
10 | 0.00076403 | 0.000764 | 7.640E-04 |
11 | 0.000076403 | 0.0000764 | 7.640E-05 |
12 | 0.000076403871 | 0.0000764 | 7.640E-05 |
13 | -0.000076403 | -0.0000764 | -7.640E-05 |
14 | 1 | 1.000 | 1.000E+00 |