In my work I have strings that contain only digits. They are very long, up to 32 characters.
I use the INPUT() function to convert them from character to numeric, but for strings longer than 16 characters the numeric result is visibly wrong. I include code that demonstrates the problem and the printout of the results. Can anyone explain what I am doing wrong?
DATA TEST_INPUT ;
INPUT CHAR_TEXT $32. ;
DATALINES;
3659218487881310922172417
3662444754643708821993473
;
RUN ;
DATA TEST_INPUT2 ;
SET TEST_INPUT ;
FORMAT CHAR_2_NUM 32. ;
FORMAT CHAR_2_NUM_16_DIGIT 32. ;
FORMAT CHAR_2_NUM_17_DIGIT 32.;
CHAR_2_NUM = INPUT(CHAR_TEXT,32.);
CHAR_2_NUM_16_DIGIT =INPUT(SUBSTR(CHAR_TEXT,1,16),32.);
CHAR_2_NUM_17_DIGIT =INPUT(SUBSTR(CHAR_TEXT,1,17),32.);
RUN ;
NUMB_TEXT | CHAR_2_NUM | CHAR_2_NUM_16_DIGIT | CHAR_2_NUM_17_DIGIT
3659218487881310922172417 | 3659218487881310752735232 | 3659218487881310 | 36592184878813112
3662444754643708821993473 | 3662444754643709451042816 | 3662444754643708 | 36624447546437088
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/hostunx/p12zsdbylnn6c2n1i48z7djr6uzo.htm
As I recall, 2 to the 53rd power (2**53), or 9,007,199,254,740,992, is the largest value up to which the current release of SAS can store every integer without loss of precision. SAS represents all numeric values as double-precision binary floating point. It doesn't have BIGINT, DECIMAL, or other types that might handle larger numbers.
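The limit can be checked directly rather than recalled from memory. The following is a minimal sketch, assuming an 8-byte numeric on Windows or Unix; it uses the documented CONSTANT('EXACTINT') function, and the variable names are only illustrative.

```sas
data _null_;
  /* Largest integer k such that every integer up to k is stored exactly */
  exactint = constant('EXACTINT');

  /* One more than that limit cannot be represented and is rounded */
  next = exactint + 1;

  put exactint= 20. next= 20.;

  if next = exactint then
    put 'Adding 1 was lost to rounding: the limit has been passed.';
run;
```

Any digit string whose value exceeds EXACTINT may be altered when INPUT() converts it to numeric, which matches the behavior seen with the 17-digit substrings in the question.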
Jim
What are you measuring or counting that requires 32 significant digits?
@prizzo60 wrote:
I agree, it is a very strange number with 32 significant digits, but this number is a "serial number" derived from a contract on the stock market.
If you aren't doing arithmetic with the variable it should be character.
Any sequential behavior of a "serial number" would be maintained, and you won't lose any information.
Account numbers, addresses, phone numbers, part numbers: character.
Height, weight, currency values, counts or measurements with units: numeric.
If you need specific portions for identifiers then you can parse them out into separate fields.
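As a rough sketch of that idea, the serial number can stay in a character variable while SUBSTR pulls out pieces of it. The field positions and the names PREFIX and SEQUENCE below are purely hypothetical; the real layout of the serial number is not given in this thread.

```sas
data contracts;
  length serial $32 prefix $7 sequence $18;
  input serial $32.;

  /* Hypothetical layout: first 7 digits identify a group,
     the remaining digits are a running sequence */
  prefix   = substr(serial, 1, 7);
  sequence = substr(serial, 8);
datalines;
3659218487881310922172417
3662444754643708821993473
;
run;

proc print data=contracts;
run;
```

Because nothing is converted to numeric, every digit of the original string is preserved in both the full value and the parsed pieces.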
@prizzo60 wrote:
I agree, it is a very strange number with 32 significant digits, but this number is a "serial number" derived from a contract on the stock market.
Then it will never be used in calculations, so store it as a character string. Just like we do with our policy "numbers".
The only thing I would add to all the comments to keep this variable in character format is to make sure that the value is right-justified. I.e. any value with fewer characters should have leading blanks not trailing blanks. This will preserve a sort order identical to what would be the case if you had sufficiently precise numeric values.
Or you might insert leading zeroes to eliminate blanks.
Added note: If you don't right justify and allow trailing blanks, then sorting on these values will produce lexicographic ordering (i.e. '1111' is less than '22 '), rather than ordering by the implicit value (i.e. ' 22' is less than '1111').
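Both variants can be sketched with standard character functions: RIGHT() shifts the text to the end of the variable, and TRANSLATE() can then turn the leading blanks into zeroes. The data set and variable names here are only illustrative.

```sas
data padded;
  length serial serial_rj serial_z $32;
  input serial $;

  /* Right-justify: trailing blanks become leading blanks */
  serial_rj = right(serial);

  /* Alternative: replace the leading blanks with zeroes */
  serial_z = translate(right(serial), '0', ' ');
datalines;
1111
22
;
run;

/* Sorting on the padded value orders by implied numeric value */
proc sort data=padded;
  by serial_z;
run;

proc print data=padded;
run;
```

This assumes the values contain only digits, so that blanks never occur inside a value and the TRANSLATE() call touches only the padding.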
Hello @prizzo60,
To store your example 25-digit strings (or even 32-digit strings) in numeric variables without losing precision, SAS under Windows or Unix would need 81 (106) mantissa bits -- i.e., 29 (54) more than are available.
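The arithmetic behind those figures can be sketched as follows, assuming the 52 stored mantissa bits of an IEEE double and a worst-case (odd) value that needs every bit below its leading 1. Only the magnitude matters, so approximate constants stand in for the 25-digit example and for the largest 32-digit value.

```sas
data _null_;
  /* Approximate magnitudes: a 25-digit example and the 32-digit maximum */
  do x = 3.6592e24, 9.9999e31;
    /* Bits needed after the implicit leading 1 of the mantissa */
    bits_needed = floor(log2(x));

    /* 52 mantissa bits are stored in an 8-byte numeric */
    shortfall = bits_needed - 52;

    put x= e12. bits_needed= shortfall=;
  end;
run;
```

The log should show 81 bits needed (29 too many) for the first value and 106 (54 too many) for the second.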
See Re: Strange behavior of long int in SAS for explanations and additional links.
I would store these "numbers" in a character variable of sufficient length (and look forward to coding the algorithms needed to perform calculations with them, if any).