Steve1964
Obsidian | Level 7

The stop statement is needed in the following data step:

data _null_;
  length mname $ 32;
  do mname = 'erase.dat', 'erase2.dat';
    infile 'c:\mydir\tmp' memvar=mname end=done;
    do while (^done);
      input x;
      put _all_;
    end;
  end;
  stop;
run;

 

The STOP statement is not needed in this other DATA step:

data _null_;
  length memname $ 1024;
  memname = ' ';
  infile 'c:\mydir\tmp' memvar=memname end=done;
  put memname=;
  do while (^done);
    input x;
    put _all_;
  end;
run;

 

Why?

 

DanielLangley
Quartz | Level 8

A DATA step has an implicit loop in it. As long as it can detect another row in a table (or another record in a file), it takes in the values from the next row and repeats all of the statements in the DATA step again.

 

In your first example, the implicit loop has been replaced with an explicit DO WHILE loop. Since the explicit loop already reads everything, we do not want the implicit loop to run again and must forcibly stop it.

 

In the second example we want it to read in the data line by line and are taking advantage of the implicit loop so we don't use a stop statement.
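
To see the implicit loop on its own, here is a minimal sketch with no explicit DO loop. It assumes that erase.dat in c:\mydir\tmp holds one numeric value per line, as the INPUT X statements in the question suggest.

```sas
data _null_;
  /* Assumed path: the directory and member name from the question. */
  infile 'c:\mydir\tmp\erase.dat';
  /* Each iteration of the implicit loop reads one record. */
  input x;
  /* _N_ counts the iterations of the DATA step. */
  put _n_= x=;
run;
```

No STOP is needed in this form: the step ends when INPUT tries to read past the last record.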

Steve1964
Obsidian | Level 7

The key point is that the first DATA step, without the STOP statement, runs in an endless loop, but the second one does not!

Astounding
PROC Star
I suspect that the second DATA step would benefit from a STOP statement (whether necessary or not). Aren't you getting a message in the log about the DATA step ending due to looping? SAS begins a second iteration of the DATA step, and halts at the bottom when that iteration does not read from any source of data. At least, that's what I think would happen. A STOP statement would eliminate that message as well as the extra work that causes it.
DanielLangley
Quartz | Level 8

I believe @Astounding is correct here. I didn't fully read the second DATA step.

hashman
Ammonite | Level 13

@Steve1964:

A short answer:

  • Without STOP in the first step, it will go into an infinite loop, endlessly re-reading the two files in the directory each time anew.
  • The second step without STOP will auto-end after program control has passed from the top to the bottom of the step without having read anything from any input source (this is known as stopping the step "due to looping").

 

A longer answer:

  • The first step is constructed in such a way that it gives no chance for program control to go from the top to the bottom of the DATA step without reading anything from any input source.
  • The second step affords that chance.

 

A long answer:

  • Step #1:
  1. After file 1 has been fully read, DONE=1.
  2. Program control goes to the top of the outer DO loop, and MNAME is set to the name of file 2.
  3. The INFILE statement is executed again, setting DONE=0.
  4. On the last record of file 2, the INPUT statement sets DONE=1.
  5. Program control exits the outer DO loop and goes to the top of the step.
  6. It reenters the outer DO loop and sets MNAME to the name of file 1 again.
  7. The INFILE statement is executed and resets DONE=0. THIS IS KEY.   
  8. Hence, file 1 is read again, then file 2, then control flow returns to #5.

Result: Because of #7, every time program control passes from the top to the bottom of the step, it reads something from the input files. Thus, the DATA step never detects the situation whereby a full top-bottom execution results in reading nothing; hence, it never stops the step from executing again. STOP prevents this by ending the step after both files have been processed.

 

  • Step #2:
  1. At _N_=1, the DO loop processes both files in succession, with DONE automatically reset from 1 to 0 every time INFILE sees an unread file in the directory.
  2. Program control goes back to the top of the step (_N_=2) and executes INFILE again.
  3. But because MEMNAME="", DONE is reset to 0 only if there is a new unread file in the directory; there is none, so it remains DONE=1.
  4. Because of that, the DO WHILE loop doesn't execute the INPUT statement, and program control exits the loop immediately.
  5. Program control reaches the bottom of the step.
  6. But now, at _N_=2, it has gone from the top to the bottom of the step without having read anything from either file 1 or file 2.
  7. Therefore, the DATA step logic detects looping and stops the step.

As @Astounding has sagely noted, this behavior should be accompanied by the following note in the log:

NOTE: DATA STEP stopped due to looping.

Yet for whatever reason, it doesn't happen when INFILE auto-reads all files from a directory, though it always happens (and should) when INPUT reads from a single file. Methinks it's a bug.    
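
One way to check this on your own system is to write _N_ to the log at the top of the step. The sketch below is the second step from the question with one PUT statement added and the PUT inside the loop removed; the directory path is the one from the question, and what the log shows may depend on the SAS version and host.

```sas
data _null_;
  length memname $ 1024;
  /* Written to the log once per iteration of the DATA step. */
  put 'Top of step: ' _n_=;
  memname = ' ';
  infile 'c:\mydir\tmp' memvar=memname end=done;
  do while (^done);
    input x;
  end;
run;
```

If the walkthrough above holds, the "Top of step" line should appear for a second iteration in which the DO WHILE loop is skipped, whether or not the looping note is printed.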

 

At any rate:

  • If you code a DoW-loop to read the whole file (or all files from a directory) explicitly, always code STOP. 
  • It's better (and less fuzzy to boot) to code DO UNTIL (DONE) rather than DO WHILE (^ DONE). This is because with UNTIL, the body of the loop always executes at least once regardless of the UNTIL condition, and so INPUT (or SET or MERGE, etc.) within the loop is always given a chance to read from the empty buffer, which will stop the step without a need for the DATA step to detect looping. 

In particular, this:

data _null_ ;
  infile whatever end = done ;
  do until (done);
    input ;
  end ;
run ;

will stop when INPUT has attempted to read from the empty file buffer, and no "DATA step stopped due to looping" note will be issued. 

Unlike this:

data _null_ ;
  infile whatever end = done ;
  do while (^ done);
    input ;
  end ;
run ;

in which case you WILL receive the log note since when DONE=1, the INPUT statement will not be executed. So, if you prefer WHILE, end the step with STOP - or, better still, always do it when you read the entire file explicitly.
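
For completeness, a sketch of the WHILE form with STOP added. WHATEVER is the same placeholder fileref used in the two steps above, not a real file.

```sas
data _null_ ;
  infile whatever end = done ;   /* WHATEVER is a placeholder fileref */
  do while (^ done);
    input ;
  end ;
  stop ;   /* end the step here so a second iteration never starts */
run ;
```

With STOP in place, the step ends as soon as the explicit loop finishes, so it does not rely on the DATA step detecting looping.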

 

Kind regards

Paul D.

 

 

Quentin
Super User

@Steve1964 , it looks like you accidentally marked your "thanks" comment as the correct answer, rather than marking Hashman's answer as correct.  You should be able to reverse this.  Marking the correct answer results in that answer being displayed first after the question.
