DATA Step, Macro, Functions and more

Difficulty with Informat

Accepted Solution Solved
Reply
Contributor
Posts: 39
Accepted Solution

Difficulty with Informat

Hi,

 

I'm trying to read a file from a text file, which has data separated by spaces. I'm having difficulty reading the 2nd column of numeric values. Some numbers do not have decimal but others have up to 15 decimal points.

See below for example of data:

 

Data:

2017-08-01 1
2017-07-01 0.996240902217826
2017-06-01 0.992495935251788

 

I am using the following data step to read this file:

 

Code:

data dataset1;
    infile "&file_location./&file_name." dlm=" "
        dsd missover firstobs=1;
    input
        data_Month : yymmdd10.
        data_Value : comma17.15
        ;
    format data_Month     DDMMYY8.
        data_Value comma17.15;
run;

 

Data read in "dataset1":

01/08/17    0.000000000000001
01/07/17    0.996240902217820
01/06/17    0.992495935251780

 

I am running into two problems:

  1. When the data value in 2nd column is just an integer, it is incorrectly read as a very small value. For example, "1" is read as " 0.000000000000001". How do I fix this?
  2. Also, for decimal values, the very last decimal point is not read. For example, the value "0.996240902217826" is read as "0.996240902217820"

Can you please guide me what I may be doing wrong?


Accepted Solutions
Solution
‎09-19-2017 04:55 PM
Super User
Posts: 13,321

Re: Difficulty with Informat

You have two issues going on. First is when you place a format on an input statement the format will be applied to the data and treating the data a fixed column in the width specified. Which does not work well when the lengths of your variables changes.

Fix is a separate informat statement:

data dataset1;
    infile "&file_location./&file_name." dlm=" " 
        dsd missover firstobs=1;
    informat 
        data_Month  yymmdd10.
        data_Value  17.15
        ;
    input 
        data_Month
        data_Value
        ;

Second is the limit of precision that SAS has for storing decimal values and 16 is the limit. but your informat only said to read 15 decimal places. So try an informat of 18.16. Note that SAS does not really treat commas in decimal portions as valid so no reason to use comma unless you are reading values greater than 999.

 

View solution in original post


All Replies
Solution
‎09-19-2017 04:55 PM
Super User
Posts: 13,321

Re: Difficulty with Informat

You have two issues going on. First is when you place a format on an input statement the format will be applied to the data and treating the data a fixed column in the width specified. Which does not work well when the lengths of your variables changes.

Fix is a separate informat statement:

data dataset1;
    infile "&file_location./&file_name." dlm=" " 
        dsd missover firstobs=1;
    informat 
        data_Month  yymmdd10.
        data_Value  17.15
        ;
    input 
        data_Month
        data_Value
        ;

Second is the limit of precision that SAS has for storing decimal values and 16 is the limit. but your informat only said to read 15 decimal places. So try an informat of 18.16. Note that SAS does not really treat commas in decimal portions as valid so no reason to use comma unless you are reading values greater than 999.

 

Super User
Posts: 6,631

Re: Difficulty with Informat

[ Edited ]

The first question is an easy one.  SAS knows how to read numbers.  Just remove COMMA17.15 from the INPUT statement.  Without going into the technical details, that's what changes the value on the first observation.

 

The second question is insurmountable.  SAS is storing numerics in 8 bytes, and doing the best it can.  Once you get up to around 15 significant digits, SAS doesn't have enough room to store the exact value and loses some precision.  A sample program to illustrate:

 

data _null_;

x=0.996240902217826;

put x 17.15;

run;

 

Super User
Super User
Posts: 7,933

Re: Difficulty with Informat

[ Edited ]

Do NOT specify decimal places on an INFORMAT, unless you want SAS to insert an implied decimal point when none is supplied in the input stream.  Try this example.

data test;
  input x 5.2 ;
  put x=Z5.2 ' Source=' _infile_;
cards;
123
2.3
;

Also in general never use MISSOVER, instead use the newer and improved TRUNCOVER option.  With MISSOVER if you try to read 17 characters and the line only has 15 then you get a missing value. With TRUNCOVER SAS will just use the available characters. Your program is ok as is because you added the : modifier which will make SAS change the width of the informat to match the width of the available data stream.

 

Your numbers are changing because you have exceeded the maximum number of digits that can be guaranteed to be exactly represented in IEEE 64 bit floating point numbers.

75    data test;
76       x= constant('exactint') ;
77       put x comma23. ;
78    run;

  9,007,199,254,740,992

If you want to read the data then use code like this and just undestand that  some numbers will be truncated to fit.

data dataset1;
    infile "&file_location./&file_name." truncover ;
    input 
        data_Month : yymmdd.
        data_Value : comma.
    ;
    format 
        data_Month yymmdd10.
        data_Value 17.15
    ;
run;
Super User
Posts: 10,689

Re: Difficulty with Informat

Don't use comma17.15 in INPUT, it will multiply 0.000000000000001 .




data have;
input x : $20. y : best32.;
format y 32.18;
cards;
2017-08-01 1
2017-07-01 0.996240902217826
2017-06-01 0.992495935251788
;
run;

Contributor
Posts: 39

Re: Difficulty with Informat

Thank you all for your help!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 157 views
  • 0 likes
  • 5 in conversation