Hello,
I have the following program :
data have;
input x $;
cards;
foo
bar
baz
;
run;
data whatever;
y=1;
run;
data _NULL_;
put "Hello";
do until(eof);
set have end=eof;
put x=;
end;
put "Goodbye";
run;
data _NULL_;
set whatever;
put "Hi";
do until(eof);
set have end=eof;
put x=;
end;
put "Bye bye";
run;
Which gives the following results when executed :
45
46
47 data _NULL_;
48 put "Hello";
49 do until(eof);
50 set have end=eof;
51 put x=;
52 end;
53 put "Goodbye";
54 run;
Hello
x=foo
x=bar
x=baz
Goodbye
Hello
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
55
56 data _NULL_;
57 set whatever;
58 put "Hi";
59 do until(eof);
60 set have end=eof;
61 put x=;
62 end;
63 put "Bye bye";
64 run;
Hi
x=foo
x=bar
x=baz
Bye bye
NOTE: There were 1 observations read from the data set WORK.WHATEVER.
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
I don't understand why, in the first data step, "Hello" is printed a second time at the end.
Is it possible to prevent this behaviour without reading an arbitrary input dataset, as in
the second example?
Thanks
The DO loop stops at EOF, but crucially the data step does not. The step loops back to the top for one more iteration, which repeats the first PUT statement and then ends when the SET statement finds nothing left to read. Put a STOP after your DO loop, like this:
data _NULL_;
put "Hello";
do until(eof);
set have end=eof;
put x=;
end;
put "Goodbye";
stop;
run;
I'm only speculating, but is it because the SET statement has moved? In your first code, the PUT has to be executed on the final pass before the step reaches the SET statement that detects the end of the data, whereas in the second the end of the data is detected earlier. So here:
data _NULL_;
set whatever;
There is nothing left to read on the last pass, so the step stops before the PUT code runs.
Here:
data _NULL_;
put "Hello";
do until(eof);
set have end=eof;
We have to execute the PUT and start the DO loop before the SET tries to read the next row, which hits the end of the file.
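One way to see the extra pass is to print the automatic variable _N_ (the data step iteration counter) next to each message. This is only a sketch built on the original step and the HAVE dataset from the question; the added _N_= items are the only change.

```sas
/* Same logic as the first step in the question, with _N_ added to each PUT */
data _null_;
  put "Hello " _n_=;      /* runs at the top of every iteration of the step */
  do until(eof);
    set have end=eof;     /* reads one row per pass of the DO loop */
    put x= _n_=;
  end;
  put "Goodbye " _n_=;
run;                      /* no STOP, so the step loops back to the top */
```

With the three rows in HAVE, everything up to "Goodbye" should be written with _N_=1, followed by one more "Hello" with _N_=2. That second iteration ends as soon as the SET statement tries to read past the end of HAVE, so neither the loop body nor "Goodbye" runs again.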
Thanks a lot @RW9 and @ChrisBrooks for these clear explanations and the solution to my problem.
To confirm @RW9's explanation: if the PUT statement is moved before the SET statement, it is executed again
on the last iteration of the data step:
68 data _NULL_;
69 put "Hi";
70 set whatever;
71 do until(eof);
72 set have end=eof;
73 put x=;
74 end;
75 put "Bye bye";
76 run;
Hi
x=foo
x=bar
x=baz
Bye bye
Hi
NOTE: There were 1 observations read from the data set WORK.WHATEVER.
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds
Hi, another way is to control your PUT statements with conditions:
data _NULL_;
set have end=eof;
if _N_ eq 1 then put "Hello";
put x=;
if eof then put "Goodbye";
run;
- Cheers -
Thanks for the advice. This is what I wanted to avoid, though, since the "Hello" and "Goodbye" parts are in
fact big chunks of code, so I am trying to avoid unnecessary indentation.
ah ok, sounds interesting...
- Cheers -
Not really. Mostly put statements that generate a web page.
The application I work on was developed under SAS 9.1.3, which did not include PROC STREAM. Switching to PROC STREAM would be too big a change to handle. Thanks anyway for the tip.
So later in the thread you clarify that this is a process that is generating a text file from a data set.
The key thing to remember is that most SAS data steps do not stop on the last statement. Instead they stop when SAS reads past the end of the input in either an INPUT or SET/MERGE/UPDATE statement.
So when you are using data to generate output and you want to handle empty files smoothly, set up your data step like this:
data _null_;
file output ;
if _n_=1 then put
'<html>'
/'<body>'
;
if eof then put
'</body>'
/'</html>'
;
set have end=eof ;
put .... ;
run;
So the first pass writes the header and the last pass writes the footer. In between, each record read from the input writes something to the middle. Note that if the input dataset is empty, both of the first two IF statements are true on the first pass, so the header and footer are written and then the data step stops when it can't read any data.
This will work even if you add a WHERE statement after the SET statement (or use a WHERE= dataset option).
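As a rough sketch of the WHERE= variant, here is the same pattern applied to the HAVE dataset from the question. The filter condition is only an illustration, and the FILE statement is left out so that the output goes to the log.

```sas
/* Header/footer pattern with a WHERE= dataset option (illustrative filter) */
data _null_;
  if _n_=1 then put '<html>' / '<body>';   /* first pass: header */
  if eof then put '</body>' / '</html>';   /* pass after the last row: footer */
  set have(where=(x ne 'bar')) end=eof;    /* step stops here once no rows remain */
  put x=;
run;
```

This should write the header, then x=foo and x=baz, then the footer. If the filter matches no rows at all, the expectation from the explanation above is that only the header and footer are written.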
You can extend it to use BY group processing also.
data _null_;
file output ;
if _n_=1 then put
'<html>'
/'<body>'
;
if eof then put
'</body>'
/'</html>'
;
set have end=eof ;
by tablename ;
if first.tablename then put
'<table>'
/'<th>' ...... '</th>'
;
put .... ;
if last.tablename then put
'</table>'
;
run;
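To try the BY-group version end to end, here is a small self-contained sketch. The COLS dataset, its COLNAME variable and the row markup are made up for illustration; they simply fill in the parts that are elided with dots above, and the output goes to the log rather than to a file.

```sas
/* Hypothetical sample data, already sorted by tablename */
data cols;
  input tablename $ colname $;
  cards;
t1 a
t1 b
t2 c
;
run;

data _null_;
  if _n_=1 then put '<html>' / '<body>';       /* header on the first pass */
  if eof then put '</body>' / '</html>';       /* footer on the pass after the last row */
  set cols end=eof;
  by tablename;
  if first.tablename then put '<table>';       /* open a table for each group */
  put '<tr><td>' colname +(-1) '</td></tr>';   /* +(-1) removes the trailing blank */
  if last.tablename then put '</table>';       /* close the table for each group */
run;
```

The footer test sits before the SET statement on purpose. EOF becomes true when the last row is read, so the footer is written at the top of the following pass, just before the SET statement ends the step.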