Hi,
I'm trying to read a file from a text file, which has data separated by spaces. I'm having difficulty reading the 2nd column of numeric values. Some numbers do not have decimal but others have up to 15 decimal points.
See below for example of data:
Data:
2017-08-01 1
2017-07-01 0.996240902217826
2017-06-01 0.992495935251788
I am using the following data step to read this file:
Code:
data dataset1;
infile "&file_location./&file_name." dlm=" "
dsd missover firstobs=1;
input
data_Month : yymmdd10.
data_Value : comma17.15
;
format data_Month DDMMYY8.
data_Value comma17.15;
run;
Data read in "dataset1":
01/08/17 0.000000000000001
01/07/17 0.996240902217820
01/06/17 0.992495935251780
I am running into two problems:
Can you please guide me what I may be doing wrong?
You have two issues going on. First is when you place a format on an input statement the format will be applied to the data and treating the data a fixed column in the width specified. Which does not work well when the lengths of your variables changes.
Fix is a separate informat statement:
data dataset1; infile "&file_location./&file_name." dlm=" " dsd missover firstobs=1; informat data_Month yymmdd10. data_Value 17.15 ; input data_Month data_Value ;
Second is the limit of precision that SAS has for storing decimal values and 16 is the limit. but your informat only said to read 15 decimal places. So try an informat of 18.16. Note that SAS does not really treat commas in decimal portions as valid so no reason to use comma unless you are reading values greater than 999.
You have two issues going on. First is when you place a format on an input statement the format will be applied to the data and treating the data a fixed column in the width specified. Which does not work well when the lengths of your variables changes.
Fix is a separate informat statement:
data dataset1; infile "&file_location./&file_name." dlm=" " dsd missover firstobs=1; informat data_Month yymmdd10. data_Value 17.15 ; input data_Month data_Value ;
Second is the limit of precision that SAS has for storing decimal values and 16 is the limit. but your informat only said to read 15 decimal places. So try an informat of 18.16. Note that SAS does not really treat commas in decimal portions as valid so no reason to use comma unless you are reading values greater than 999.
The first question is an easy one. SAS knows how to read numbers. Just remove COMMA17.15 from the INPUT statement. Without going into the technical details, that's what changes the value on the first observation.
The second question is insurmountable. SAS is storing numerics in 8 bytes, and doing the best it can. Once you get up to around 15 significant digits, SAS doesn't have enough room to store the exact value and loses some precision. A sample program to illustrate:
data _null_;
x=0.996240902217826;
put x 17.15;
run;
Do NOT specify decimal places on an INFORMAT, unless you want SAS to insert an implied decimal point when none is supplied in the input stream. Try this example.
data test;
input x 5.2 ;
put x=Z5.2 ' Source=' _infile_;
cards;
123
2.3
;
Also in general never use MISSOVER, instead use the newer and improved TRUNCOVER option. With MISSOVER if you try to read 17 characters and the line only has 15 then you get a missing value. With TRUNCOVER SAS will just use the available characters. Your program is ok as is because you added the : modifier which will make SAS change the width of the informat to match the width of the available data stream.
Your numbers are changing because you have exceeded the maximum number of digits that can be guaranteed to be exactly represented in IEEE 64 bit floating point numbers.
75 data test; 76 x= constant('exactint') ; 77 put x comma23. ; 78 run; 9,007,199,254,740,992
If you want to read the data then use code like this and just undestand that some numbers will be truncated to fit.
data dataset1;
infile "&file_location./&file_name." truncover ;
input
data_Month : yymmdd.
data_Value : comma.
;
format
data_Month yymmdd10.
data_Value 17.15
;
run;
Don't use comma17.15 in INPUT, it will multiply 0.000000000000001 . data have; input x : $20. y : best32.; format y 32.18; cards; 2017-08-01 1 2017-07-01 0.996240902217826 2017-06-01 0.992495935251788 ; run;
Thank you all for your help!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.