geneshackman
Pyrite | Level 9

If I am reading from raw data, using column input, that is, saying var1 is in column 2-3, var2 in column 4-5 and so on, like this

 

DATA b;
INFILE b lrecl=1501 truncover;
INPUT
@1 IPOB $CHAR5.
@6 TPOB $CHAR1.
@7 RCD $CHAR4.
@11 PFI $CHAR4.

and so on

 

What's the purpose of truncover? If PFI starts at column 11, and is 4 columns, but some records only have data in two columns, I understand that if I have truncover in the infile, that tells SAS to read whatever data is there, in those four columns, and use it. But if I don't have truncover (or missover), I do have column identifiers. Will SAS still try to go to the next variable or next line or next set of columns or next something to try to read data for that variable, pfi? Aren't I telling SAS what columns to use, so why would SAS try to get data from other columns?

 

Basically, if I use column input, what does truncover or missover do?

 

Thanks

 

DartRodrigo
Lapis Lazuli | Level 10

Hi mate,

If you only need an explanation of missover and truncover, see Making Sense of the INFILE and INPUT Statements.

But if you need to read raw files, there are many ways around this.

 

One of these ways is to set up one large string variable that holds the entire row.

This entire row has all the subcolumns in it.

 

Syntax: input entire_row $200.;

 

and then use a series of IF statements to pick out each piece of the raw record.
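A minimal sketch of that idea, using SUBSTR rather than IF statements to cut the pieces out (my substitution, not something from the papers mentioned here). It assumes the fileref b and the column layout from the original question; the 200-character width is only illustrative and would need to cover the widest column you read.

```sas
/* Assumes fileref b is already assigned, as in the original question */
data b;
   infile b lrecl=1501 truncover;

   /* Read the start of each record into one long character variable.  */
   /* TRUNCOVER keeps short records from pulling in the next line.     */
   input entire_row $char200.;

   /* Cut the fixed-position pieces out of the row */
   length IPOB $5 TPOB $1 RCD $4 PFI $4;
   IPOB = substr(entire_row, 1, 5);
   TPOB = substr(entire_row, 6, 1);
   RCD  = substr(entire_row, 7, 4);
   PFI  = substr(entire_row, 11, 4);

   drop entire_row;
run;
```

Because entire_row is always 200 characters long (blank padded), the SUBSTR calls never run off the end, even on a short record.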

 

Or you can use the features of the INPUT statement; for that, check out Reading Raw Files.

 

Hope this helps

geneshackman
Pyrite | Level 9

Thanks for the note. My question, though, was about a very specific thing: basically, if I use column input, what does truncover or missover do? I read lots of online info, and saw the difference between missover and truncover (everything is missing vs. get whatever data are there), and I understand that both will prevent SAS from going to the next line if the last variable in the data set is variable length. But I had the very specific question, mentioned above, that I can't seem to find an answer to.

DartRodrigo
Lapis Lazuli | Level 10

Check out What is the difference between Missover and Truncover?

 

Although it is subtle, the difference is there:

 

/*
use notepad to save the following lines
as a text file labeled emplist.txt
(fields start in columns 1, 22, 32 and 38,
to match the INPUT statements below)

LANGKAMM             SARAH     E0045 Mechanic
TORRES               JAN       E0029 Pilot
SMITH                MICHAEL   E0065
LEISTNER             COLIN     E0116 Mechanic
TOMAS                HARALD
WADE                 KIRSTEN   E0126 Pilot
WAUGH                TIM       E0204 Pilot

then run the following:
*/

DATA test1;
  INFILE "c:\emplist.txt" missover;
  INPUT lastn $1-21 Firstn $ 22-31
   Empid $32-36 Jobcode $37-45;
RUN;

DATA test2;
  INFILE "c:\emplist.txt" truncover;
  INPUT lastn $1-21 Firstn $ 22-31
   Empid $32-36 Jobcode $37-45;
RUN;
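One quick way to see where the two results differ is to compare the datasets directly. This is only a sketch, and it assumes test1 and test2 were built by the two steps above from a file laid out in the fixed columns the INPUT statements name.

```sas
/* List the values that differ between the MISSOVER and TRUNCOVER reads */
proc compare base=test1 compare=test2;
run;
```

Any differences should be confined to records that end partway through a field: MISSOVER sets such a field to missing, while TRUNCOVER keeps the characters that are there.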

 

You can find an explanation in the paper at:

http://www2.sas.com/proceedings/sugi26/p009-26.pdf

 

HTH,

Art

geneshackman
Pyrite | Level 9

Thanks. Again, though, I am not looking for an explanation of the difference between truncover and missover. I looked on the web and got the difference. I am looking for an answer to a very specific question, which I have not found yet.

ballardw
Super User

Since the "very specific question" I see in your post is:

"Basically, if I use column input, what does truncover or missover do?"

 

Then I am failing to understand how a reference to how truncover and missover work is not sufficient.

TRUNCOVER and MISSOVER both relate to the total line length and number of variables, not the length of any specific variable.

 

However, you may be having problems reading your data because of the way SAS works when you write your INPUT statement with informats specified. When the informat is part of the INPUT statement, reading starts at the current position (11 for PFI) and takes exactly 4 columns. If only 2 of those columns hold the intended data, followed by a space and then the start of the next value, PFI will still get all 4 characters, the last being whatever is in column 14.

If you want to use list input, it usually works better to specify the informats first and then the INPUT, especially if your data is delimited (list input assumes a space is the delimiter).

Example:

DATA b;
   INFILE b lrecl=1501 truncover; 
      informat
      IPOB $CHAR5.
      TPOB $CHAR1.
      RCD $CHAR4.
      PFI $CHAR4.
   ;
   INPUT
      IPOB 
      TPOB 
      RCD 
      PFI 
   ;
run;

geneshackman
Pyrite | Level 9

Thanks for the response. Perhaps I should ask, if I specify the columns for the data, are truncover or missover needed? As I understand it, if you just list the variables in the input statement, without indicating columns, and if there are any missing data, SAS will go to the next non missing numbers or next line to try to fill in the variable. Truncover and missover are attempts to prevent that, basically to tell SAS if you get to the end of the line and there isn't enough data, stop there. Do not go to the next line. Truncover and missover are slightly different ways of telling SAS to not go to the next line, if there aren't enough characters in the last variable, and the input statement does not specify columns for the variable.

 

However, that is not my case. I do specify what columns to use in my input statement. I am writing, input, varnames, varcolumns. So I am not asking how truncover or missover differ in handling the situation when there aren't enough characters in the last variable, and the input statement does not include columns for the variable.

 

I am asking, again, if I do specify the columns in the input statement, what is the purpose of truncover or missover. I am telling SAS what columns to use for the variables, since I am specifying columns in the input statement. It seems logical to me that if I do specify the columns in the input statement, then SAS should not be looking in any other columns, regardless of whether the columns specified have data or are blank.

 

Or, are you saying that even if I do specify the columns in the input statement, if there aren't enough characters or digits in those columns, SAS will go on to the next columns to try to fill the data?

ballardw
Super User

@geneshackman wrote:

 

I am asking, again, if I do specify the columns in the input statement, what is the purpose of truncover or missover. I am telling SAS what columns to use for the variables, since I am specifying columns in the input statement. It seems logical to me that if I do specify the columns in the input statement, then SAS should not be looking in any other columns, regardless of whether the columns specified have data or are blank.


When you specify the columns and you specify a column that is not there then the Truncover/Missover/Flowover options come into play.

When you say to read at column 456 and your input data file does not have 456 columns that kicks in the behavior. If your datafile is perfect then the option won't mean much (though if you specify columns past the LRECL you'll have issues).

Here's a simple example of reading 30 one-character variables at columns 1 to 30, where the file looks like this:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

BBBBBBB

CCCCCCCCCCCCCCCCCCCCCCCCCCCC

If your file structure has an end of line after the 7th B and you have an INPUT @8 V8, then you're going to have an issue, and the options let you specify what to do. The default (FLOWOVER) will attempt to read from the C line; TRUNCOVER and MISSOVER stay on the B line.
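A small sketch to try this out. The fileref demo and the dataset names are made up for illustration, and it assumes a Windows or Unix session where PUT writes variable-length text lines.

```sas
/* Hypothetical temp file holding the three sample lines */
filename demo temp;

data _null_;
   file demo;
   put 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA';
   put 'BBBBBBB';
   put 'CCCCCCCCCCCCCCCCCCCCCCCCCCCC';
run;

/* Default (FLOWOVER): the B record has no column 8, */
/* so INPUT moves on to the C record to fill v8     */
data flow;
   infile demo;
   input @1 v1 $char1. @8 v8 $char1.;
run;

/* TRUNCOVER: INPUT stays on the B record and v8 is left blank */
data trunc;
   infile demo truncover;
   input @1 v1 $char1. @8 v8 $char1.;
run;
```

I would expect the first step to write the log note "SAS went to a new line when INPUT statement reached past the end of a line" and to produce one observation fewer than the second, because the B and C records are consumed by a single pass of the INPUT statement.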

 

There is a chance that you are confusing your declared line length (LRECL) with the actual line length (the number of characters before the end-of-line marker). The PAD option adds blanks to the end of each line to fill it out to the LRECL, so that variables read from past the actual end of the line see those blanks.

geneshackman
Pyrite | Level 9

You mean, if I have this:

DATA b;
 INFILE b truncover;
 INPUT
 var1 1-4
 var2 5-6
 var3 8-12;
RUN;

then, if I don't have truncover (or missover), and one record happens to be missing values for var3 (while other records do have values for var3), SAS will go to the next record and look in some other columns to try to fill in var3, even though I told it to use specific columns? It seems illogical for SAS to do that. But if that's what SAS does, then I would need truncover.

Haikuo
Onyx | Level 15

OK, I feel your frustration, and I am telling you, I have been there myself. You do have some conceptual confusion between 'formatted input' and 'column input', but that does not compromise the point of your question. Let me answer it quickly, and then I will try to explain it using some code. In short, SAS does NOT care about your column settings at the end of the line: if there is nothing there, SAS WILL move on to the next line by default, regardless of what kind of input method you use, including 'column input', until you tell SAS not to do it.

 


filename FT15F001 temp;
data have;
/*infile FT15F001 truncover;*/
/* reads and keeps whatever is there; does not move on to the next line */
/*infile FT15F001 missover;*/
/* sets the last variable to missing if it is only partly there; does not move on to the next line */
/*infile FT15F001 pad;*/
/* pads each record with blanks out to the logical record length (LRECL), so SAS can still honor your column instructions */

infile FT15F001;
/* flows over to the next line: the default behavior */
input v1 $ 1-2 v2 $ 4-30;
   parmcards;
as abcd
as asdlfk
;;;;
run;
Tom
Super User

@geneshackman wrote:

I am asking, again, if I do specify the columns in the input statement, what is the purpose of truncover or missover. I am telling SAS what columns to use for the variables, since I am specifying columns in the input statement. It seems logical to me that if I do specify the columns in the input statement, then SAS should not be looking in any other columns, regardless of whether the columns specified have data or are blank.

 

Or, are you saying that even if I do specify the columns in the input statement, if there aren't enough characters or digits in those columns, SAS will go on to the next columns to try to fill the data?


 

At this point it isn't really a question of what is logical anymore. It is just a question of what it does. You can test the behavior if you are confused about how it will behave. SAS was originally implemented over 50 years ago, back in the 1960s and 70s. They go to great effort when adding new features to not break old programs. If you ask SAS to read the 10th column from a line that does NOT have a 10th column, the default behaviour (the FLOWOVER option) is to go to the next line.

 

You can prevent this by specifying TRUNCOVER. You could also use MISSOVER, but that introduces other strange things (a field that is only partly present on the line is set to missing instead of being read).

You could also prevent it by making sure that your LRECL is larger than the column you are trying to read and setting the PAD option so that there will be actual spaces in the column for the INPUT statement to read.
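Here is a rough sketch of the two approaches side by side. The fileref, dataset, and variable names are invented for illustration, and LRECL=20 is just a value comfortably larger than the column being read.

```sas
/* Hypothetical temp file: the middle record has no 10th column */
filename demo temp;

data _null_;
   file demo;
   put 'AAAAAAAAAAAA';
   put 'BBBBBBB';
   put 'CCCCCCCCCCCC';
run;

/* Approach 1: TRUNCOVER keeps INPUT on the current record */
data with_truncover;
   infile demo truncover;
   input @1 c1 $char1. @10 c10 $char1.;
run;

/* Approach 2: PAD fills every record with blanks out to LRECL, */
/* so column 10 always exists and reads as a blank when empty   */
data with_pad;
   infile demo lrecl=20 pad;
   input @1 c1 $char1. @10 c10 $char1.;
run;

proc print data=with_truncover; run;
proc print data=with_pad; run;
```

Both steps should give one observation per record, with c10 blank on the B row. Removing TRUNCOVER or PAD from the INFILE statement shows the default flow-over behaviour for comparison.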

geneshackman
Pyrite | Level 9

Okay, thanks all for the responses. So what SAS does doesn't seem logical to me, but that's what it does. Using truncover is a must.

ballardw
Super User

I would say that knowing your file format is the "must" and be thankful for tools that allow reading not quite perfect or variable data layout.

 

 

 

 
