Log displays "Invalid data" when reading a range of variables using formatted input

New Contributor
Posts: 2

Hi,

 

I'm preparing for the Base SAS Programming certification and am currently reading chapter 17: reading free-format data (Third Edition). I tried executing the code snippet on page 543, in which the data step reads a range of numeric variables and specifies a format for it. I am working on SAS University Edition.

 

The following is my code snippet:

filename sf '/folders/myfolders/salesfmtd.txt';
data salesfmtd;
	infile sf;
	input Name $ (Sales1-Sales3) (7.);
run;

The salesfmtd.txt file has the following contents:

 

X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720

 

The output data contains just one record as follows:

X . . .

i.e. the values are missing and represented by a period.

 

The log displays the following notes:

 

NOTE: Invalid data for Sales1 in line 1 3-9.
NOTE: Invalid data for Sales2 in line 1 10-16.
NOTE: Invalid data for Sales3 in line 2 1-7.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 Y 1260 1700.345 1900 20
NOTE: Invalid data errors for file SF occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
Name=X Sales1=. Sales2=. Sales3=. _ERROR_=1 _N_=1
NOTE: Invalid data for Sales2 in line 3 10-16.
NOTE: LOST CARD.
NOTE: Invalid data errors for file SF occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
Name=Z Sales1=1600.07 Sales2=. Sales3=. _ERROR_=1 _N_=2
NOTE: 3 records were read from the infile SF.
The minimum record length was 16.
The maximum record length was 20.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.SALESFMTD has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
 
Can someone please tell me where the problem lies? I don't see why there could be a problem with specifying 7. as the format.
 
Thanks in advance!

 

 

Super Contributor
Posts: 490

Re: Log displays "Invalid data" when reading a range of variables using formatted input

Your range of variables is not character or formatted, so you do not need to enclose them in parentheses.

This will work

input Name $ Sales1-Sales3 7.;
New Contributor
Posts: 2

Re: Log displays "Invalid data" when reading a range of variables using formatted input

Posted in reply to mohamed_zaki

Thanks for your reply, Zaki. I modified the code according to what you said, but I still didn't get the desired output.

Following are the log messages:

 

NOTE: Invalid data for Sales3 in line 2 1-7.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 Y 1260 1700.345 1900 20
Name=X Sales1=1500 Sales2=1280 Sales3=. _ERROR_=1 _N_=1
NOTE: LOST CARD.
Name=Z Sales1=1600.076 Sales2=1450 Sales3=. _ERROR_=1 _N_=2
NOTE: 3 records were read from the infile SF.
The minimum record length was 16.
The maximum record length was 20.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.SALESFMTD has 1 observations and 4 variables.
 
Following is the output:
X 1500 1280 .
 
Super User
Posts: 19,785

Re: Log displays "Invalid data" when reading a range of variables using formatted input

Remove the 7. entirely. You can add an informat if your data requires it, but yours doesn't.
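
As a minimal sketch of that suggestion, plain list input with no informat could look like the step below. It uses in-stream DATALINES instead of the external file, purely so that it can be run on its own; that substitution is an assumption and not part of the original code.

```sas
/* Plain list input: each value is read up to the next blank, */
/* so the varying field widths in the raw data do not matter. */
data salesfmtd;
    input Name $ Sales1-Sales3;
    datalines;
X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720
;
run;

proc print data=salesfmtd;
run;
```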
Trusted Advisor
Posts: 1,117

Re: Log displays "Invalid data" when reading a range of variables using formatted input

As you're preparing for the certification, you should have come across the terms "formatted input" and "list input." Your raw data file seems to be space-delimited, and the column widths vary between observations. This is a typical situation calling for list input. You can specify informats in the (list) INPUT statement or, alternatively, in an INFORMAT statement. This is then called modified list input, and in the INPUT statement its syntax requires the informat names to be prefixed with colons:

input Name $ (Sales1-Sales3) (:7.);
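
For the INFORMAT statement alternative mentioned above, a sketch along these lines should behave the same way. In-stream DATALINES are used here only to keep the example self-contained; with the external file you would keep the INFILE statement instead.

```sas
data salesfmtd;
    /* Declaring the informat up front leaves the INPUT statement in plain */
    /* list-input form, so no colon modifier is needed there.              */
    /* Side effect: Sales1-Sales3 are defined before Name, so they come    */
    /* first in the variable order of the data set.                        */
    informat Sales1-Sales3 7.;
    input Name $ Sales1-Sales3;
    datalines;
X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720
;
run;
```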

The above INPUT statement should work well with your data. All four variables are read using list input. Without the colon, Sales1 - Sales3 would be read using formatted input, which has implications for

  • where data is read and
  • where the pointer is located after reading a value.

(For details please see the documentation or, for another user's example, this older thread.)
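
To make the two bullet points concrete, here is a small sketch that reads the first raw data line with a character informat of the same width, so you can see which columns formatted input actually picks up. The variable names Chunk1 and Chunk2 are made up for illustration, and DATALINES stands in for the external file.

```sas
data _null_;
    /* Name is read with list input, which leaves the pointer in column 3.    */
    /* Each $CHAR7. then takes exactly 7 columns, blanks included, regardless */
    /* of where a value starts or ends.                                       */
    input Name $ Chunk1 $char7. Chunk2 $char7.;
    put Name= Chunk1= Chunk2=;
    datalines;
X 1500 1280 1800
;
run;
```

Chunk1 should cover columns 3-9 and Chunk2 columns 10-16, the same ranges the log flags for Sales1 and Sales2. Each chunk straddles two values, which is why the numeric informat 7. reports invalid data there.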

 

As Reeza pointed out, there is actually no need to specify informat 7. here (the values are read correctly by default). Also, the width specification of an informat used in modified list input (unlike in formatted input) is ignored in the reading process anyway.* So, in your example you could even specify informat 1. (which looks far too short for your numeric raw data values) and all the digits would nevertheless be read correctly with modified list input.
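
A quick way to check this claim is a sketch like the following, where the data set name width_demo is just a placeholder and DATALINES replaces the external file:

```sas
data width_demo;
    /* The colon makes this modified list input, so the width in 1. */
    /* is not used to limit how much of each field is read.         */
    input Name $ (Sales1-Sales3) (:1.);
    put Name= Sales1= Sales2= Sales3=;
    datalines;
X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720
;
run;
```

The PUT statement writes each observation to the log, so you can compare the values against the raw data lines.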

 

* (But it does have an impact on the length of character variables if their length has not been set previously!)
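
The footnote can be illustrated with a short sketch. The variable names Code and CodeLen and the sample value are invented for this example.

```sas
data _null_;
    /* There is no LENGTH statement, so the informat :$3. is what */
    /* determines the length of Code.                             */
    input Code :$3.;
    CodeLen = vlength(Code);   /* length of Code as defined at compile time */
    put Code= CodeLen=;
    datalines;
ABCDEFG
;
run;
```

Under that assumption, Code is defined with a length of 3, so only the first three characters of the raw value can be stored. A LENGTH statement placed before the INPUT statement would avoid the truncation.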
