Re: Counting decimals in numeric fields.

Longimanus · Posted 06-06-2024 09:28 AM

Hi there. 🤔

I have an issue with decimals in numeric fields. I read somewhere SAS can be pretty accurate up to 15 - 16 decimals. Say let's say I have a z19:16 formatted numeric field (I need to format it (I think!) to get more than 10 decimals which I think is the standard if the variable is just defined as numeric) that contains: 12.775. I need to send this to a mainframe to which we only can send integers. So to make the mainframe understand it is 12.775, I need to send 12775 and the no. of decimal positions as a separate number, i.e. 3. Another example would be 12.6802 where I have to send 126802 and 4 to "tell" the mainframe to divide by 10.000. So my problem is this: I am not so much into HEX codes and what is going on regarding the internal form SAS operates with with huge number - either before or after the decimal point. The solutions I found on the net use SUBST and SCAN function that normally would do good for text variables but ALSO it seems for numbers up to 10 digits before or after the decimal point (I guess that relates to the entire HEX issue). Yes, I do sound like somebody who need to train to understand this but I have a deadline and need to solve this quick.

So in short .. the code I'll send after this text works for values up to 10 digits. If I have more the result is unpredictable. I need to code this different than what I did below. I need some code that can give me 12 for 77,123456789012, 3 for 66,378, 14 for 95,12345678901234. How do code that people?

What I did to get the right the answer ... (that only works up to 10 decimals). I build a table with some values and then use two methods I found on the internet to determine the number of decimals:

Run the code if you want and then read what I wrote after it.

/* Create a little test data. */

data decimal_play;

format pi_1 pi_2 pi_3 pi_4 pi_5 Z20.17 ;

pi_1 = 3.141592650000000;

pi_2 = 3.141592653000000;

pi_3 = 3.141592653500000;

pi_4 = 3.141592653580000;

pi_5 = 3.141592653589793;

run;

/* Method 1 */

data Method_1 (keep = dec_pi_1 dec_pi_2 dec_pi_3 dec_pi_4 dec_pi_5);;

set decimal_play;

dec_pi_1 = lengthn(scan(cat(pi_1),2,'.'));

dec_pi_2 = lengthn(scan(cat(pi_2),2,'.'));

dec_pi_3 = lengthn(scan(cat(pi_3),2,'.'));

dec_pi_4 = lengthn(scan(cat(pi_4),2,'.'));

dec_pi_5 = lengthn(scan(cat(pi_5),2,'.'));

output;

run;

/* Method 2 */

data Method_2 (keep = dec_pi_1 dec_pi_2 dec_pi_3 dec_pi_4 dec_pi_5);

set decimal_play;

first = substr(pi_1, 1, index(pi_1, '.') - 1); second = left(substr(pi_1, index(pi_1, '.') + 1)); dec_pi_1 = LENGTH(second);

first = substr(pi_2, 1, index(pi_2, '.') - 1); second = left(substr(pi_2, index(pi_2, '.') + 1)); dec_pi_2 = LENGTH(second);

first = substr(pi_3, 1, index(pi_3, '.') - 1); second = left(substr(pi_3, index(pi_3, '.') + 1)); dec_pi_3 = LENGTH(second);

first = substr(pi_4, 1, index(pi_4, '.') - 1); second = left(substr(pi_4, index(pi_4, '.') + 1)); dec_pi_4 = LENGTH(second);

first = substr(pi_5, 1, index(pi_5, '.') - 1); second = left(substr(pi_5, index(pi_5, '.') + 1)); dec_pi_5 = LENGTH(second);

run;

So how should I code this nicely? Code that reads a number, gives me the number as an integer (max 18 long, i.e.: 2 before the decimal point and 16 after) and the number of decimals is a separate variable.

3.141592650000000 as 314159265 & 8

3.141592653000000 as 3141592653 & 9

3.141592653500000 as 31415926535 & 10

3.141592653580000 as 314159265358 & 11

3.141592653589793 as 3141592653589793 & 15

Best regards,

Menno 🙈

ballardw · Posted 06-06-2024 10:27 AM

Two things to start: default FORMAT for numeric values is BEST12. That means SAS will attempt to fit any number into a maximum of 12 characters.

Second: the FORMAT has no effect on the actual values stored. Your FORMAT says how you want to display the value. So assigning a format of z19:16 only means that you are forcing a leading 0 if the integer portion is less than 10 and forcing 16 decimal places in the result:

As a somewhat extreme example run this program and look at the log:

data _null_;
   x= .5;
   put 'Best5. ' x=best5.;
   put 'z19.16 ' x=z19.16;
   put 'E10.   ' x=e10.;
   put 'Time.  ' x=time.;
   put 'Date9. ' x=date9.;
run;

Unless you read the original values from a text file and store such information at that time, for each variable and observation then this exercise is next to meaningless as the values have already been limited . The values stored in SAS do not keep any information about number of significant decimal places.

If it is over 14 decimals the values from SAS are likely to be a bit suspect at that 15th (or higher) decimal point due to computer storage precision limited by 8 bytes (nothing to do with SAS, everything to do with binary representation of decimal values).

What significant impact would it have on the receiving program to have the value as 31 and 1 (decimal) or 3100 and 3 or 310000 and 5? If the answer is None then just pick a precision and deal

data junk;
  input x;
  int= put(x*1E15,f18. -L);
  decimals=15;
datalines;
3.141592650000000 
3.141592653000000 
3.141592653500000 
3.141592653580000
3.141592653589793 
run;

Longimanus · Posted 06-07-2024 04:44 AM

I will have a look at what you replied. Thanx a lot. I will of course tell all you guys how I solved it. Thanx for clearing some wrong assumptions I add. 😉

Astounding · Posted 06-06-2024 03:56 PM

Given the limitations and constraints you describe, one approach stands out to me. Send two integers to the mainframe, not one. Send digits before the decimal point as one field, and digits after the decimal point as a second field. Then let the mainframe programmers worry about how to put them together. You maintain the greatest accuracy by having the mainframe receive data that is 100% correct.

Tom · Posted 06-06-2024 08:59 PM

If you need to now know the precision of the number you cannot start with just the number since the precision is not stored with the number. Only the number itself is stored. So the following strings imply different number of digits of precision:

3.141592650000000
3.14159265000000
3.1415926500000
3.141592650000
3.14159265000
3.1415926500
3.141592650
3.14159265

But they will all be stored as the exact same number.

So where is the precision of the number supposed to come from? Is it something you know it advance? For example why did you decide that

3.141592650000000 has only 8 digits of precision? Why not 9 or 10 or 15?

Longimanus · Posted 06-07-2024 04:41 AM

I'd say because those 8 count. The zeroes after that don't change the number. So when the company decides we run with 10 or 15 decimals I'm only interested in the last number right of the decimal point that is not a zero. The remainder is irrelevant.

Ksharp · Posted 06-06-2024 09:19 PM

data decimal_play;
input pi;
format pi Z20.17 ;
cards;
3.141592650000000
3.141592653000000
3.141592653500000
3.141592653580000
3.141592653589793
3
.314
0
; 
data want;
 set decimal_play;
 want=lengthn(prxchange('s/0+$//',1 ,strip(scan(vvalue(pi),-1,'.'))));
run;

Longimanus · Posted 06-07-2024 04:50 AM

I will check this out ... look's nice and easy ...

Longimanus · Posted 06-07-2024 04:56 AM

I just run your code. Curious to see the result. Spot on 🙂 The numbers are in character variables and not in numeric ones. What I can make up from what I read (in the suggestions of all of you) is that I cannot avoid using character variables to solve this. I'll get back to all of you how I have solved it in the end.

s_lassen · Posted 06-07-2024 04:30 AM

I think @ballardw gave a pretty good answer about how to get a solution.

I just want to issue a word of warning:

SAS can be pretty accurate up to 15 - 16 decimals

Yes, and no. SAS can be pretty accurate up to 15 or 16 DIGITS, not 15 or 16 DECIMALS. So if you try to code a number like 95,12345678901234 as an integer with 16 digits, you run into several problems. The first being that the internal representation of this number is very unlikely to be exact, the second is that the integer representation is probably not exact either:

 69         data _null_;
 70           a=95.12345678901235;
 71           b=a*1e14;
 72           put a= 17.14 b=16.0;
 73         run;
 
 a=95.12345678901230 b=9512345678901236

(I changed the last decimal on the input to demonstrate both problems with a single input).

Anyway, if you are sending stuff to a mainframe, the programmers there will probably have an easier time reading stuff with a fixed number of decimals. So unless you have some numbers with many digits before the decimal point (and few after) and others with many digits after the decimal point (and few before), it is probably better to stick with e.g. 12 decimals for all numbers.

If you must find the (approximately correct) number of decimals, you could try something like

b=left(put(number,best32.));
if index(b,'.') then
  decimals=length(b)-index(b,'.');
else
  decimals=0;
int_value=compress(b,'.'); /* get the integer value as a string */

This way (using a string for the integer value) the integer will be compliant with the decimal value shown by SAS (none of them are necessarily exactly correct).

Longimanus · Posted 06-07-2024 04:49 AM

Thank you for that nice explanation. I will look into this and try to use what you suggested and inject my know-how of how SAS deals with numbers internally. Chat GTP also came up with a solution where I need to use text variables to solve this problem. I had some time in the bus yesterday and - for fun - asked Chat GTP how it would solve my problem. It came with a solution but as we all know Chat GTP can make anything look good. I am going to run that too. I will be back to all you guys about what my final solution will look like.

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!