essaykay
Calcite | Level 5

Hi,

 

I'm preparing for the Base SAS Programming certification and am currently reading chapter 17: reading free-format data (Third Edition). I tried executing the code snippet on page 543, in which the data step reads a range of numeric variables and specifies a format for it. I am working on SAS University Edition.

 

The following is my code snippet:

filename sf '/folders/myfolders/salesfmtd.txt';
data salesfmtd;
	infile sf;
	input Name $ (Sales1-Sales3) (7.);
run;

The salesfmtd.txt file has the following contents:

 

X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720

 

The output data contains just one record as follows:

X . . .

i.e. the values are missing and represented by a period.

 

The log displays the following notes:

 

NOTE: Invalid data for Sales1 in line 1 3-9.
NOTE: Invalid data for Sales2 in line 1 10-16.
NOTE: Invalid data for Sales3 in line 2 1-7.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 Y 1260 1700.345 1900 20
NOTE: Invalid data errors for file SF occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
Name=X Sales1=. Sales2=. Sales3=. _ERROR_=1 _N_=1
NOTE: Invalid data for Sales2 in line 3 10-16.
NOTE: LOST CARD.
NOTE: Invalid data errors for file SF occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
Name=Z Sales1=1600.07 Sales2=. Sales3=. _ERROR_=1 _N_=2
NOTE: 3 records were read from the infile SF.
The minimum record length was 16.
The maximum record length was 20.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.SALESFMTD has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
 
Can someone please tell me where the problem lies? I don't see why there could be a problem with specifying 7. as the format.
 
Thanks in advance!

 

 

4 REPLIES
mohamed_zaki
Barite | Level 11

Your range of variables is not character or formatted, so you do not need to enclose it in parentheses.

This will work

input Name $ Sales1-Sales3 7.;
essaykay
Calcite | Level 5

Thanks for your reply, Zaki. I modified the code according to what you said, but I still didn't get the desired output.

Following are the log messages:

 

NOTE: Invalid data for Sales3 in line 2 1-7.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 Y 1260 1700.345 1900 20
Name=X Sales1=1500 Sales2=1280 Sales3=. _ERROR_=1 _N_=1
NOTE: LOST CARD.
Name=Z Sales1=1600.076 Sales2=1450 Sales3=. _ERROR_=1 _N_=2
NOTE: 3 records were read from the infile SF.
The minimum record length was 16.
The maximum record length was 20.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.SALESFMTD has 1 observations and 4 variables.
 
Following is the output:
X 1500 1280 .
 
Reeza
Super User
Remove the 7. entirely. You can add an informat if your data requires one, but here it doesn't.
FreelanceReinh
Jade | Level 19

As you're preparing for the certification, you should have come across the terms "formatted input" and "list input." Your raw data file seems to be space-delimited and the column widths vary between observations. This is a typical situation calling for list input. You can specify informats in the (list) INPUT statement (or, alternatively, in an INFORMAT statement), which is then called modified list input, but the syntax for this requires the informat names to be prefixed with colons:

input Name $ (Sales1-Sales3) (:7.);

The above INPUT statement should work well with your data: all four variables are read using list input. Without the colon, Sales1-Sales3 would be read using formatted input, which has implications for

  • where data is read and
  • where the pointer is located after reading a value.

(For details please see the documentation or, for another user's example, this older thread.)
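
To make the difference concrete, here is a minimal sketch of the corrected step. It assumes the three records from the question are supplied inline with DATALINES instead of the external file, purely so that the step is self-contained; the column positions mentioned in the comments are taken from the log posted in the question.

```sas
/* Sketch: same records as salesfmtd.txt, supplied inline (an assumption */
/* made for self-containment; the original step reads an external file). */
data salesfmtd;
    /* The colon makes 7. a modified-list-input informat, so each Sales  */
    /* value is read from the next non-blank character up to the         */
    /* following blank, whatever its width.                              */
    /* Without the colon, formatted input reads exactly 7 columns per    */
    /* variable: the log in the question shows Sales1 taken from columns */
    /* 3-9 and Sales2 from columns 10-16 of line 1.                      */
    input Name $ (Sales1-Sales3) (:7.);
    datalines;
X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720
;
run;

proc print data=salesfmtd;
run;
```

With these records the step should produce three observations, one per input line, without the "Invalid data" or "LOST CARD" notes.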

 

As Reeza pointed out, there is actually no need to specify informat 7. here (the values are read correctly by default). Also, the width specification of an informat used in modified list input (not: formatted input) is ignored anyway in the reading process.* So, in your example you could even specify informat 1. (which looks as if it was way too short for your numeric raw data values) and nevertheless all the digits would be read correctly with modified list input.

 

* (But it does have an impact on the length of character variables if their length has not been set previously!)
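
Both points can be checked with a small sketch. The data set name width_demo and the $12. informat on Name are hypothetical choices for illustration; the records are the ones from the question, supplied inline so that the step is self-contained.

```sas
data width_demo;   /* hypothetical data set name */
    /* :1. looks too narrow, but in modified list input the width is not */
    /* used to delimit the numeric field, so all digits should be read.  */
    /* :$12. (a hypothetical choice) sets the length of Name to 12,      */
    /* because no LENGTH statement has defined Name beforehand.          */
    input Name :$12. (Sales1-Sales3) (:1.);
    datalines;
X 1500 1280 1800
Y 1260 1700.345 1900
Z 1600.076 1450 1720
;
run;

proc print data=width_demo;      /* inspect the values that were read */
run;

proc contents data=width_demo;   /* inspect the length of Name */
run;
```

PROC PRINT lets you compare the Sales values with the raw records, and PROC CONTENTS shows the length that Name received from the informat.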
