Input statement: informat question

Frequent Contributor
Posts: 100

Greetings:
I got confused while learning how informats work when reading raw data files. Could any expert give me a hint?

Raw data in a txt file:
NewYork 12
LA 24

Here is what I tried first.
data work.ny;
infile ny;
input city $7. visit;
run;
This gave me only the first row, plus this message in the log:
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.

Could someone help me understand why SAS would reach past the end of the first line at all? I know that if "LA" were "Seattle" (a longer value), my code would work fine, so I don't think the first observation is the problem. To stop SAS from reaching past the end, I tried MISSOVER. This did bring in a second row, but with missing values. In the end, I had to use a colon modifier.

My SAS book says "The informat in modified list input determines only the length of the variable, not the number of columns that are read." But in this case, if I don't use an informat for city, the original code works fine, so I'm puzzled.

Thank you for your time!
PROC Star
Posts: 7,363

Re: Input statement: informat question

You're just a little confused. You can find a nice description at:
http://support.sas.com/publishing/pubcat/chaps/58369.pdf

The problem is that you are NOT using the method that statement describes. For modified list input you have to include a colon in your INPUT statement, e.g.:

input city : $7. visit;

Otherwise, yes, SAS will read all seven characters for city, regardless of whether they include embedded spaces or not. If your values also have embedded spaces, then you would have to use the ampersand modifier as well.
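
As a minimal sketch of the difference, here is the data from the original post read with the colon modifier. The in-stream DATALINES are an assumption made so the step is self-contained; the original post read an external file through the fileref ny.

[pre]
data work.ny;
  /* Assumption: in-stream data stands in for the original ny fileref */
  infile datalines;
  /* The colon makes SAS read city up to the next blank, at most 7 characters */
  input city : $7. visit;
  datalines;
NewYork 12
LA 24
;
run;

proc print data=work.ny;
run;
[/pre]

With the colon in place, both rows should come in: NewYork with 12 and LA with 24.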

HTH,
Art
Frequent Contributor
Posts: 100

Re: Input statement: informat question

Thanks to you both, Art and Ksharp. The recommended file is very easy to follow.
Just to confirm:

My SAS Certification book says 'The colon (:) modifier is used to read nonstandard data values and character values that are longer than eight characters, but which contain no embedded blanks.' It almost makes me think that if character values are shorter than 8 bytes and do not have embedded blanks, the modifiers are not necessary.

Now, after our earlier discussion, it seems that I have to use : or & so that SAS understands I'm not trying to read the data with formatted input? I guess it matters most when the variable starts in column 1; otherwise the pointer control in the statement makes the input method easy to tell.

Many thanks!
Frequent Contributor
Posts: 100

Re: Input statement: informat question

Tested more and confirmed the answer to my previous post should be yes. Thanks for the learning.
Super User
Posts: 9,681

Re: Input statement: informat question

>if character values are shorter than 8 bytes and do not have embedded blanks, the modifiers are not necessary.

That holds only for list input. That is, the list input ' input a $ ' is the same as the modified list input ' input a : $8.'. Once you put an informat in the INPUT statement, you need the colon, as Art mentioned: ' input a : $7.'.
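
A small sketch to check this equivalence yourself. The data set names list_in and modified_in and the sample values are made up for illustration.

[pre]
/* Simple list input: character variable a gets the default length of 8 */
data list_in;
  input a $;
  datalines;
NewYork
LA
;
run;

/* Modified list input: the colon plus $8. also gives a length of 8 */
data modified_in;
  input a : $8.;
  datalines;
NewYork
LA
;
run;

/* PROC COMPARE reports any differences in values or attributes */
proc compare base=list_in compare=modified_in;
run;
[/pre]

Both steps stop reading at the first blank, so the comparison should find no differences in the values.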
Frequent Contributor
Posts: 100

Re: Input statement: informat question

Thanks for the additional explanation! I noticed that whenever I use an informat for a character variable in the INPUT statement, SAS treats it as formatted input unless I use the colon. It makes a real difference when my values have different lengths.

Here's my quick test:
Raw data:
Andy Lee 150
Adam Jack 200
Mary Jacob 300

When I test Input Firstname $ Lastname $5. Hours; the lastname for the first row shows "Lee 1" (because SAS grabbed 5 characters) and hours = 50, even though I "thought" my code was using modified list input and I was hoping SAS would stop reading the lastname field at the space after Lee. In short, I have to use the colon to get the data in right. Hope this helps the next puzzled student.
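
For anyone who wants to reproduce the test, here is a sketch with in-stream data. The data set names formatted_try and colon_fix are just labels chosen for this illustration.

[pre]
/* Formatted input for lastname: always takes the next 5 columns */
data formatted_try;
  input firstname $ lastname $5. hours;
  datalines;
Andy Lee 150
Adam Jack 200
Mary Jacob 300
;
run;

/* Modified list input: the colon stops at the blank after each last name */
data colon_fix;
  input firstname $ lastname : $5. hours;
  datalines;
Andy Lee 150
Adam Jack 200
Mary Jacob 300
;
run;

proc print data=formatted_try;
run;

proc print data=colon_fix;
run;
[/pre]

The first listing should show the "Lee 1" and 50 problem on row one; the second should show Lee with 150.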
SAS Super FREQ
Posts: 8,743

Re: Input statement: informat question

Hi:
There is another way to read your data, using simple list input (not formatted input), as described here:
http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/a001066690.htm

Although you can use mixed input types, when you used $5. to read LASTNAME, you changed from list input to "formatted input", as described here:
http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/a001052077.htm

Note that, as it says at the end of the topic -- SAS reads formatted input until it has read the number of positions specified by the INFORMAT. In your case, $5. was considered to be the INFORMAT for the LASTNAME field.

So when you had
[pre]
input firstname $ lastname $5. ... ;
[/pre]

You started with simple list input for FIRSTNAME and then switched to formatted input for LASTNAME. I'm not sure how you read HOURS -- with formatted or list input.

You can use a LENGTH statement with simple list input to specify the maximum length for a character variable that you are going to read with list input. If you specify the length, then list input will read until it hits a delimiter (the default delimiter for list input is a space or blank -- although, you can change it with the DLM option).

The program below reads your data without using the colon modifier.

cynthia
[pre]
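data hourdata;
  /* LENGTH sets the character lengths before INPUT creates the variables */
  length firstname $8 lastname $15;
  infile datalines;
  /* Simple list input: each value is read up to the next blank */
  input firstname $ lastname $ hours;
  return;   /* explicit RETURN; the step would do this implicitly anyway */
  datalines;
Andy Lee 150
Adam Jack 200
Mary Jacob 300
John Jingleheimer 400
;
run;

ods listing;

/* Check the variable types and lengths */
proc contents data=hourdata;
  title 'PROC CONTENTS';
run;

/* Check the values that were read */
proc print data=hourdata;
  title 'PROC PRINT';
run;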
[/pre]
Super User
Posts: 9,681

Re: Input statement: informat question

>whenever I use informat for character variable in the input statement, SAS will treat it as formatted input, unless I use colon. It makes a real difference when my values have different length.


Yes, you are right. But the length of the variable is the length you specify in the informat; using the colon in your INPUT statement does not change that. One important thing to remember: when you use a LENGTH statement and an INFORMAT statement before the INPUT statement, the input method behaves the same as the colon input method, as Cynthia mentioned. Also, once a character variable enters the PDV, its length cannot be changed afterwards.
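
A minimal sketch of that point, using Cynthia's sample data. The data set name hours_informat is made up for this illustration.

[pre]
data hours_informat;
  /* The INFORMAT statement sets the type and length of lastname up front */
  informat lastname $15.;
  /* No informat in INPUT, so lastname is read with list input up to the next blank */
  input firstname $ lastname hours;
  datalines;
Andy Lee 150
Adam Jack 200
Mary Jacob 300
John Jingleheimer 400
;
run;

proc print data=hours_informat;
run;
[/pre]

Because the informat is declared outside the INPUT statement, SAS should still stop at each blank, just as it does with ' : $15.'.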

Now let's take a look at your example.
In your code, 'Hours' is list input and 'Firstname' is list input (with the SAS default length of eight). 'Lastname' is formatted input, which ignores delimiters such as blanks and reads through the fifth character, so you get 'Lee 1', not 'Lee'. You should add a colon before $5., as in ' : $5. '. The colon makes SAS stop reading when it encounters a delimiter (a blank and so on), but the length of the variable is still 5.

Hope this helps you a little bit.
Cynthia gave some valuable references about it.



Ksharp
Frequent Contributor
Posts: 100

Re: Input statement: informat question

Cynthia and Ksharp:
Thank you so much for all the detailed clarification!
I tested your suggestions and they all make sense to me now. I'm glad you are here for the beginners :).
Contributor
Posts: 24

Re: Input statement: informat question

Hi mnew

The problem is that you are confusing list input and formatted input. You never use informats with plain list input!

The only exception is the : modifier, used when a character value is longer than 8 characters or a numeric value is nonstandard.

Second, never use formatted input with free-format data. Use it only with fixed-field data. From the very beginning you have been trying to use formatted input to read free-format data.

Third, to keep SAS from jumping to the next line when a record is too short, use the TRUNCOVER option in the INFILE statement.
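
A sketch of the TRUNCOVER suggestion, combined with the colon modifier discussed earlier in the thread. The temporary fileref is an assumption that stands in for the original text file, so the example can be run as is.

[pre]
/* Assumption: a temporary file stands in for the original text file */
filename ny temp;

/* Write the two sample records to that file */
data _null_;
  file ny;
  put 'NewYork 12';
  put 'LA 24';
run;

data work.ny;
  /* TRUNCOVER keeps INPUT on the current record when a line is short */
  infile ny truncover;
  /* The colon is still needed so that city stops at the first blank */
  input city : $7. visit;
run;

proc print data=work.ny;
run;
[/pre]

Note that TRUNCOVER on its own, with ' city $7. ' and no colon, would avoid the jump to a new line, but city for the second row would hold 'LA 24' and visit would be missing.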
Super User
Posts: 9,681

Re: Input statement: informat question

Hi.
In SAS, there are four input methods: list input, formatted input, column input, and named input.
The difference you refer to is between list input and formatted input. Art has given some details on it.
So if you understand these four input methods, you will be able to process complicated data perfectly.
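
As a rough side-by-side sketch, here is one small step per input style. The data set names and sample values are invented for illustration.

[pre]
/* List input: values are separated by blanks */
data by_list;
  input name $ hours;
  datalines;
Andy 150
Adam 200
;
run;

/* Column input: each value sits in fixed columns */
data by_column;
  input name $ 1-4 hours 6-8;
  datalines;
Andy 150
Adam 200
;
run;

/* Formatted input: informats and pointer controls set the widths */
data by_formatted;
  input name $4. +1 hours 3.;
  datalines;
Andy 150
Adam 200
;
run;

/* Named input: each value is written as variable=value */
data by_named;
  input name=$ hours=;
  datalines;
name=Andy hours=150
name=Adam hours=200
;
run;
[/pre]

All four steps should produce the same two observations; only the way the INPUT statement locates each value differs.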




Ksharp