eof= option and number of observations output

Super Contributor
Posts: 282

Hi all,

We are using SAS 9.1.3 on Mainframe z/OS.

I expected the following simplified code to produce 3 output observations, but instead it produces 4, the last of which is blank:

data test;
  infile datalines eof=no_more_data;
  input text $char3.;
  return;
  no_more_data:
    putlog 'Finished';
    *stop;
  datalines;
AAA
BBB
CCC
;

Uncommenting the stop statement creates the desired output of 3 observations.

Inserting an extra return statement immediately before the datalines statement still produces 4 output observations.

I have the feeling it has something to do with the use of the eof= option along with the labelled section, but I couldn't find any evidence to back this up in the documentation or on the web.

Does anyone have an explanation for this behaviour, or have I just missed something?

Thanks & regards,

Amir.


Accepted Solutions
Solution
‎07-12-2012 09:24 AM
Respected Advisor
Posts: 3,799

Re: eof= option and number of observations output

As written, your data step includes an implied RETURN; as the last statement before CARDS. When control is transferred to the statement label, statements are executed until a RETURN, implied or explicit (or a STOP, of course, but STOP is very strong medicine indeed). In the absence of an explicit OUTPUT, OUTPUT is implied with RETURN.

So what did you expect and what do you want to happen?



All Replies
Super Contributor
Posts: 282

Re: eof= option and number of observations output

Posted in reply to data_null__

Hi all,

Thank you all for taking the time to respond. I would say you have confirmed my suspicions about what was happening in the data step, and I was glad to hear the suggestions on the use of an explicitly coded output statement, as this was one of the things I had tried. Another thing I tried was a delete statement in place of the stop, which also gave 3 output observations, although its use might be somewhat puzzling at first sight.

data _null_: In answer to your questions: "So what did you expect..." I expected 3 observations to be output; "...and what do you want to happen?" I wanted 3 observations to be output.

Thanks & regards,

Amir.

Occasional Contributor
Posts: 5

Re: eof= option and number of observations output

Delete works by accident here: by the time you hit it, you're already on the fourth iteration of the datastep, so the required records have already been output. It therefore deletes the nonexistent record, and as you're at end of file, the datastep terminates.
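
For reference, the delete variant being discussed would look roughly like the sketch below. It is the original poster's step with delete in place of the commented-out stop; the comments are an interpretation based on the explanations in this thread, not taken from the documentation.

```sas
data test;
  infile datalines eof=no_more_data;
  input text $char3.;
  return;                /* implied OUTPUT writes each record that was read */
  no_more_data:
    putlog 'Finished';
    delete;              /* drops the current (blank) observation, so no 4th row is written */
  datalines;
AAA
BBB
CCC
;
```

As reported above, this gave 3 output observations, the same count as the stop version.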

Occasional Contributor
Posts: 15

Re: eof= option and number of observations output

Amir

Wouldn't it be better to use the end= option like this:

data test;
   infile datalines end=no_more_data;
   input text $char3.;
   if no_more_data then putlog 'Finished';
datalines;
AAA
BBB
CCC
;

Respected Advisor
Posts: 3,799

Re: eof= option and number of observations output

Posted in reply to ChrisSelley

It don't work.

2443  data test;
2444     infile datalines end=no_more_data;
WARNING: The value of the INFILE END= option cannot be set for CARDS or DATALINES input.
2445     input text $char3.;
2446     if no_more_data then putlog 'Finished';
2447  datalines;
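
For contrast, END= is normally usable when the input comes from an external file rather than in-stream data. A minimal sketch, assuming a temporary fileref named demo (a hypothetical name, not from this thread) and that the TEMP device is available on your platform:

```sas
/* demo is a hypothetical temporary fileref used only for illustration */
filename demo temp;

/* write three short records to the temporary file */
data _null_;
  file demo;
  put 'AAA' / 'BBB' / 'CCC';
run;

/* END= sets no_more_data to 1 while the last record is being processed */
data test;
  infile demo end=no_more_data;
  input text $char3.;
  if no_more_data then putlog 'Finished';
run;
```

Because the flag is set on the last record rather than on a failed read, this form should not need a labelled section or a stop statement.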


Occasional Contributor
Posts: 15

Re: eof= option and number of observations output

Posted in reply to data_null__

That'll teach me to try it first!

Occasional Contributor
Posts: 5

Re: eof= option and number of observations output

You need to understand what's happening here.

SAS can't take the EOF= branch without starting a fourth pass through the data step, because it only identifies end of file on the fourth execution of the input statement, when it finds nothing more to read.

So the implicit output statement referred to in the previous answers will be executed a fourth time, unless you specifically prevent it.

Stop is part of the language to terminate the datastep when automatic termination isn't appropriate. This is one time when stop is appropriate.

Return won't do it, because it'll just write out the current (blank) observation and send the processing back to the top of the data step.

An alternative is to use an explicit output statement before the return. This will suppress the automatic/implied output statement at the end of the datastep, and because it sits after the input statement (where the jump to 'no_more_data' happens), it will only be executed three times, once for each input record found. There isn't an implicit return statement here. SAS just drops out of the datastep once eof is reached.

As a matter of clarity, always use return statements at the end of each link section and at the end of the code before the labels. 

Not sure why 'Stop is very strong medicine indeed', unless it's being mixed up with endsas.

So your code should be:

data test;
  /* better/clearer */
  infile datalines eof=no_more_data;
  input text $char3.;
  return;
  no_more_data:
    putlog 'Finished';
    stop;
  return;
  datalines;
AAA
BBB
CCC
;

OR:

data test;
  /* will work */
  infile datalines eof=no_more_data;
  input text $char3.;
  output;
  return;
  no_more_data:
    putlog 'Finished';
  return;
  datalines;
AAA
BBB
CCC
;


Super User
Posts: 5,516

Re: eof= option and number of observations output

Amir,

It sounds like you're looking for an explanation, so here's mine.

The normal end to a DATA step is to have the INPUT statement fail because it tries to read an extra line of data that doesn't exist.  You can demonstrate that by adding this line both before and after the INPUT statement:

put _all_;

Adding EOF= changes that behavior.  Instead of ending the DATA step when the INPUT statement fails, control is passed to the labeled section.  The labeled section executes (again, try adding _all_ to the PUTLOG statement in the labeled section).  However, the DATA step does not halt as it usually would.  There is no failed INPUT statement.  Instead, the fourth observation gets output (you can verify this in part by adding a RETAIN ID statement and seeing how the results change).  Then the DATA step halts as the software "figures out" that it would have to loop forever if it were to continue.
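
The tracing suggested above could be sketched as follows. This is only an illustration: it uses PUTLOG rather than PUT (either writes to the log when no FILE statement is in effect), and the literal prefixes are made up for readability.

```sas
data test;
  infile datalines eof=no_more_data;
  putlog 'Before INPUT: ' _all_;   /* shows _N_ and TEXT at the top of each pass */
  input text $char3.;
  putlog 'After INPUT:  ' _all_;   /* only reached when a record was actually read */
  return;
  no_more_data:
    putlog 'At EOF label: ' _all_; /* reached when INPUT finds nothing more to read */
  datalines;
AAA
BBB
CCC
;
```

Comparing the _N_ values on the three kinds of log lines should show which pass through the step reaches the label, and what TEXT holds when the extra observation is written.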

That's my story and I'm sticking to it.

Good luck.
