Hello, I'm reading the advanced certification prep guide and I have two questions about the code below:
data work.quarter;
   do i = 9, 10, 11;
      nextfile="c:\sasuser\month"
         !!compress(put(i,2.)!!".dat",' ');
      do until (lastobs);
         infile temp filevar=nextfile end=lastobs;
         input Flight $ Origin $ Dest $ Date : date9.
            RevCargo : comma15.2;
         output;
      end;
   end;
   stop;
run;
Question 1:
The code is meant to read in multiple text files (i.e. month9.dat, month10.dat and month11.dat). According to the prep guide, the STOP statement (in the second-to-last line) is for the DO loop. I'm wondering why it is needed when the loop is supposed to end after i = 11.
Question 2:
I'm also wondering why we need the OUTPUT statement (line #9). Can anyone explain the process behind this?
Thank you.
Sam
Hello,
Actually, it is not for the DO loop but for the DATA step.
It is clearly stated in the book: "STOP statement prevents an infinite loop of the DATA step."
Thanks Loko for the prompt response. Could you explain what causes the infinite loop of the data step? From what I understand, there is just one DO loop that iterates over 9, 10 and 11. I am not sure what causes the data step to run infinitely.
Without the stop statement, the data step would automatically go into its next iteration and do the whole process over again. And again. And again ....
Since you do not have a set or similar statement where the data step could get an EOF-marker from the contributing dataset(s), it would iterate forever.
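One way to watch this re-iteration happen is sketched below. It is the original program with the STOP statement removed, plus two lines that are not in the prep guide: a PUTLOG that writes the automatic variable _N_ to the log, and a hypothetical safety guard that ends the step at the start of its third iteration so the experiment cannot run forever. It assumes the three month files exist at the paths shown.

```sas
data work.quarter;
   /* Not in the original: log each pass of the DATA step. */
   putlog "DATA step iteration " _n_;
   /* Not in the original: safety guard so the experiment ends. */
   if _n_ > 2 then stop;
   do i = 9, 10, 11;
      nextfile = "c:\sasuser\month"
                 !! compress(put(i,2.) !! ".dat", ' ');
      do until (lastobs);
         infile temp filevar=nextfile end=lastobs;
         input Flight $ Origin $ Dest $ Date : date9.
               RevCargo : comma15.2;
         output;
      end;
   end;
   /* STOP deliberately omitted here. */
run;
```

If the explanation above holds, the log should show more than one iteration message, and work.quarter should end up with each file's records more than once.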
Thanks Kurt. Sorry I just wanna make sure I fully understand this. I simplified the program and removed the two DO-loops:
data work.quarter;
nextfile="c:\sasuser\month9.dat";
infile temp filevar=nextfile end=lastobs;
input Flight $ Origin $ Dest $ Date : date9.
RevCargo : comma15.2;
run;
I will be reading in only month9.dat. In this example there is no SET statement either, but it runs fine without the STOP statement. Could you explain why?
Thanks so much.
Sam
When there is no change to the FILEVAR= variable, SAS can use the EOF of that one file to terminate the data step.
As soon as there is a change, SAS cannot be sure that any given infile is the last one and disregards the EOF marker, which causes a new iteration of the data step. Since that iteration changes the FILEVAR= variable again (it starts over with the first of your three input files), the whole process repeats.
If you do
do i = 11 to 11;
in your original step, you can omit the stop statement. As soon as there is a change in nextfile (and the SAS interpreter seems to be quite intelligent at inferring a possible change from the code), you get an infinite loop.
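Written out in full, the single-file variant described above would look roughly like this. It is a sketch based on that explanation: the only changes from the original program are the DO range and the missing STOP.

```sas
data work.quarter;
   do i = 11 to 11;   /* nextfile gets the same value on every pass */
      nextfile = "c:\sasuser\month"
                 !! compress(put(i,2.) !! ".dat", ' ');
      do until (lastobs);
         infile temp filevar=nextfile end=lastobs;
         input Flight $ Origin $ Dest $ Date : date9.
               RevCargo : comma15.2;
         output;
      end;
   end;
   /* No STOP: nextfile never changes, so per the explanation above */
   /* the end of month11.dat is enough to terminate the step.       */
run;
```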
Kurt, your explanation is extremely clear. Thanks so much!
Now I have another question, about the DO UNTIL loop.
data work.quarter;
   do i = 9, 10, 11;
      nextfile="c:\sasuser\month"
         !!compress(put(i,2.)!!".dat",' ');
      * do until (lastobs);
         infile temp filevar=nextfile end=lastobs;
         input Flight $ Origin $ Dest $ Date : date9.
            RevCargo : comma15.2;
         output;
      * end;
   end;
   stop;
run;
I ran the code above with the DO UNTIL loop commented out, and SAS read in only the first record from each external file. Could you explain why SAS behaves like this? I just want to fully understand why the DO UNTIL loop is needed here.
The infile statement has two functions:
- it names the infile and sets parameters
- but it also marks the place in the code where the read is performed.
So, in the do i = 9, 10, 11 loop, the filename is set, the file is opened, a read is performed, and then the next iteration of the do begins, causing a new file to be opened.
The do until(lastobs) creates an explicit loop over the infile.
The OUTPUT statement in the inner DO loop is necessary because without it the data step would rely on its single implicit output at the end of the step, which is never reached because of the STOP.
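Putting the points from this thread together, the original program can be annotated as follows. The code is the prep guide version with only indentation and comments added; the comments summarize the explanations given above.

```sas
data work.quarter;
   do i = 9, 10, 11;                       /* one pass per input file */
      /* Build the file name, e.g. c:\sasuser\month9.dat. */
      nextfile = "c:\sasuser\month"
                 !! compress(put(i,2.) !! ".dat", ' ');
      do until (lastobs);                  /* explicit loop over the records */
         /* A change in nextfile makes SAS open the new file;  */
         /* lastobs is set to 1 when the last record is read.  */
         infile temp filevar=nextfile end=lastobs;
         input Flight $ Origin $ Dest $ Date : date9.
               RevCargo : comma15.2;
         output;                           /* explicit: the implicit output */
      end;                                 /* at RUN is never reached       */
   end;
   stop;                                   /* keeps the DATA step from iterating again */
run;
```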
That's some great explanation. Very much appreciate it, Kurt.