BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asimraja
Fluorite | Level 6

Hi,

 

I'm trying to read a file from a text file, which has data separated by spaces. I'm having difficulty reading the 2nd column of numeric values. Some numbers do not have decimal but others have up to 15 decimal points.

See below for example of data:

 

Data:

2017-08-01 1
2017-07-01 0.996240902217826
2017-06-01 0.992495935251788

 

I am using the following data step to read this file:

 

Code:

data dataset1;
    infile "&file_location./&file_name." dlm=" "
        dsd missover firstobs=1;
    input
        data_Month : yymmdd10.
        data_Value : comma17.15
        ;
    format data_Month     DDMMYY8.
        data_Value comma17.15;
run;

 

Data read in "dataset1":

01/08/17    0.000000000000001
01/07/17    0.996240902217820
01/06/17    0.992495935251780

 

I am running into two problems:

  1. When the data value in 2nd column is just an integer, it is incorrectly read as a very small value. For example, "1" is read as " 0.000000000000001". How do I fix this?
  2. Also, for decimal values, the very last decimal point is not read. For example, the value "0.996240902217826" is read as "0.996240902217820"

Can you please guide me what I may be doing wrong?

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

You have two issues going on. First is when you place a format on an input statement the format will be applied to the data and treating the data a fixed column in the width specified. Which does not work well when the lengths of your variables changes.

Fix is a separate informat statement:

data dataset1;
    infile "&file_location./&file_name." dlm=" " 
        dsd missover firstobs=1;
    informat 
        data_Month  yymmdd10.
        data_Value  17.15
        ;
    input 
        data_Month
        data_Value
        ;

Second is the limit of precision that SAS has for storing decimal values and 16 is the limit. but your informat only said to read 15 decimal places. So try an informat of 18.16. Note that SAS does not really treat commas in decimal portions as valid so no reason to use comma unless you are reading values greater than 999.

 

View solution in original post

5 REPLIES 5
ballardw
Super User

You have two issues going on. First is when you place a format on an input statement the format will be applied to the data and treating the data a fixed column in the width specified. Which does not work well when the lengths of your variables changes.

Fix is a separate informat statement:

data dataset1;
    infile "&file_location./&file_name." dlm=" " 
        dsd missover firstobs=1;
    informat 
        data_Month  yymmdd10.
        data_Value  17.15
        ;
    input 
        data_Month
        data_Value
        ;

Second is the limit of precision that SAS has for storing decimal values and 16 is the limit. but your informat only said to read 15 decimal places. So try an informat of 18.16. Note that SAS does not really treat commas in decimal portions as valid so no reason to use comma unless you are reading values greater than 999.

 

Astounding
PROC Star

The first question is an easy one.  SAS knows how to read numbers.  Just remove COMMA17.15 from the INPUT statement.  Without going into the technical details, that's what changes the value on the first observation.

 

The second question is insurmountable.  SAS is storing numerics in 8 bytes, and doing the best it can.  Once you get up to around 15 significant digits, SAS doesn't have enough room to store the exact value and loses some precision.  A sample program to illustrate:

 

data _null_;

x=0.996240902217826;

put x 17.15;

run;

 

Tom
Super User Tom
Super User

Do NOT specify decimal places on an INFORMAT, unless you want SAS to insert an implied decimal point when none is supplied in the input stream.  Try this example.

data test;
  input x 5.2 ;
  put x=Z5.2 ' Source=' _infile_;
cards;
123
2.3
;

Also in general never use MISSOVER, instead use the newer and improved TRUNCOVER option.  With MISSOVER if you try to read 17 characters and the line only has 15 then you get a missing value. With TRUNCOVER SAS will just use the available characters. Your program is ok as is because you added the : modifier which will make SAS change the width of the informat to match the width of the available data stream.

 

Your numbers are changing because you have exceeded the maximum number of digits that can be guaranteed to be exactly represented in IEEE 64 bit floating point numbers.

75    data test;
76       x= constant('exactint') ;
77       put x comma23. ;
78    run;

  9,007,199,254,740,992

If you want to read the data then use code like this and just undestand that  some numbers will be truncated to fit.

data dataset1;
    infile "&file_location./&file_name." truncover ;
    input 
        data_Month : yymmdd.
        data_Value : comma.
    ;
    format 
        data_Month yymmdd10.
        data_Value 17.15
    ;
run;
Ksharp
Super User
Don't use comma17.15 in INPUT, it will multiply 0.000000000000001 .




data have;
input x : $20. y : best32.;
format y 32.18;
cards;
2017-08-01 1
2017-07-01 0.996240902217826
2017-06-01 0.992495935251788
;
run;

asimraja
Fluorite | Level 6

Thank you all for your help!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 4713 views
  • 0 likes
  • 5 in conversation