- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello everyone,
I have a database with values saved as character. The number of decimal points can vary between observation I.e. both 10,132 and 6,254252 could be found.
I want to convert these character values to a numeric value while retaining all decimal points. Using input(char,best.) yields me 1 decimal points. Simply multiplying the character string with 1 yields me 5. Exactly what is going on here, and how can i adjust the code to include all/more decimals?
See example below.
data test;
format CharVar $50. value_num1 value_num2 20.14;
CharVar="7663229328,8184132475796";
Value_num1 = input(tranwrd(CharVar,",","."), best.);
Value_num2 = tranwrd(CharVar,",",".")*1;
run;
CharVar= 7663229328,8184132475796
Value_num1=7663229328.800000000
Value_num2=7663229328.818410000
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Shawnty
SAS stores numerical values as double precision floating point values. That means that any value is truncated to 16 digits (the largest exact number represented is 9007199254740992), and the decimal point can be placed anywhere (that's what floating point means). So your test value cannot be represented better than 7663229328.81841, the remaining digits after the decimal point will always be lost.
data test;
CharVar="7663229328,8184132475796";
Value_num3 = input(CharVar,numx26.13);
put Value_num3 25.13;
put Value_num3 20.8;
put Value_num3 17.5;
run;
7663229328.8184100000000
7663229328.81841000
7663229328.81841
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
First, what's going on here ...
The BEST family of informats has a default width. Evidently, if you don't specify a width such as best8 or best14, it will read 12 characters. So that's why you end up with 1 decimal place.
So your second approach would be more accurate. However, SAS cannot store the full number exactly. SAS stores numbers in 8 bytes. Internally, that allows it to store roughly 15 digits, plus an indication of positive or negative, plus an indication of where the decimal point should go. If you want the full value stored as numeric, you would have to choose some other plan. For example, you could keep the variable as character. Or you could split the variable into two pieces (digits before the decimal point, and digits after the decimal point). There is no way that SAS can accurately store the full value in one numeric variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
All three conversions below end up with 5 decimals.
Not sure why it "stops" at 5 decimals.
data test;
format CharVar $50. /* value_num1 value_num2 20.14 */;
CharVar="7663229328,8184132475796";
Value_num1 = input(tranwrd(CharVar,",","."), best32.); put Value_num1= 32.15;
Value_num2 = tranwrd(CharVar,",",".")*1; put Value_num2= 32.15;
Value_num3 = input(CharVar,commax32.); put Value_num3= 32.15;
run;
/* end of program */
Will investigate further tomorrow.
Koen
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sbxkoenk wrote:
Hello,
All three conversions below end up with 5 decimals.
Not sure why it "stops" at 5 decimals.
data test; format CharVar $50. /* value_num1 value_num2 20.14 */; CharVar="7663229328,8184132475796"; Value_num1 = input(tranwrd(CharVar,",","."), best32.); put Value_num1= 32.15; Value_num2 = tranwrd(CharVar,",",".")*1; put Value_num2= 32.15; Value_num3 = input(CharVar,commax32.); put Value_num3= 32.15; run; /* end of program */
Will investigate further tomorrow.
Koen
Simple, too many significant digits. Lots of precision discussions on this and other forums. You get about 15 significant digits in a Windows environment.
Move the comma and you get more decimals as the significant digits in the integer portion are fewer and allow more in the decimal (though still not 15 depending on where you place the decimal/comma in the original string).
11 data test; 12 format CharVar $50. /* value_num1 value_num2 20.14 */; 13 CharVar="76632,293288184132475796"; 14 Value_num1 = input(tranwrd(CharVar,",","."), best32.); put Value_num1= 32.15; 15 Value_num2 = tranwrd(CharVar,",",".")*1; put Value_num2= 32.15; 16 Value_num3 = input(CharVar,commax32.); put Value_num3= 32.15; 17 run; NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 15:18 Value_num1=76632.293288184100000 Value_num2=76632.293288184100000 Value_num3=76632.293288184100000
Count the digits in the about output, when you get to 15 everything after the "841" is 0.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @Shawnty,
Just to add a few details to what has been said already: Assuming SAS 9.4 under Windows or Unix, the internal 64-bit floating-point representation of your example value "7663229328,8184132475796", read with the NUMX32. informat, is
0100000111111100100011000011100110011001000011010001100000111000
(use put(input(CharVar,numx32.),binary64.) to get this and see Numerical Accuracy in SAS Software for the documentation).
- The green zero represents the positive sign of the number.
- The 11 red bits -- the (mathematical) binary representation of 1023+32=1055 -- store the order of magnitude, i.e., "where the decimal point is," more precisely: that the absolute value of the number is contained in the interval [2**32, 2**33).
- The 32 blue bits are needed to store the integer part of the number, 7663229328, whose binary representation is 111001000110000111001100110010000 (33 digits, the first digit is the so-called "implied bit," omitted in the internal representation).
- Only the 20 black bits are left to store a binary approximation of the fractional part, 0.8184132475796, whose exact binary representation is infinitely (!) long and starts with
0.1101000110000011100001111101010011110111101100000100000011000010000101100011001011010111001111001110...
So it must be rounded (down in this case). The least significant bit of the internal representation of your number has a place value of 2**-20 = 0.00000095367431640625 ≈ 1E-6. Converted back to the decimal system, the internal representation and its "neighbors" (i.e., one less and one more in the last bit, hence approx. -/+ 1E-6) correspond to
7663229328.81841182708740234375 7663229328.81841278076171875000 7663229328.81841373443603515625
You see that SAS chose the closest possible value to your number, given the limited space available (64 bits). The sixth decimal place (16th decimal digit overall) is not displayed with standard numeric SAS formats. That's why you got 7663229328.81841, not the more complete 7663229328.818413.
With 24 additional bits, i.e., 44 for the fractional part, SAS could have achieved a sufficient precision:
7663229328.81841324757954225788125768303871154785156250 7663229328.81841324757959910130011849105358123779296875 7663229328.81841324757965594471897929906845092773437500
Storing the fractional part in a separate numeric variable (the good idea suggested by @Astounding) would provide those additional bits and even a few more (a total of 52 mantissa bits plus the implied bit). Indeed, the internal representation of 0.8184132475796
0011111111101010001100000111000011111010100111101111011000001000
(note the "missing" implied bit and the additional 33 bits at the end compared to what we had when we included the space-consuming integer part)
corresponds to
0.81841324757959998947853819117881357669830322265625
which is a better approximation of the exact decimal value.