Consider a piece of code like this:
DATA one;
  INPUT name $ a1-a5;
  ARRAY a {*} a1-a2;
  DO i = 1 TO 2;
    a(i) = 6 - a(i);
    OUTPUT; *...............(1);
  END;
  avrg = MEAN(OF a1-a5);
  *DROP i; *................(2);
DATALINES;
Joe 5 5 5 5 5
Ray 1 1 1 1 1
;
RUN;
PROC PRINT DATA=one;
RUN;
---------- OUTPUT1 ----------
Obs  name  a1  a2  a3  a4  a5  avrg
1    Joe   1   5   5   5   5   .
2    Joe   1   1   5   5   5   .
3    Ray   5   1   1   1   1   .
4    Ray   5   5   1   1   1   .
---------- OUTPUT2 ----------
Obs  name  a1  a2  a3  a4  a5  i  avrg
1    Joe   1   1   5   5   5   3  3.4
2    Ray   5   5   1   1   1   3  2.6
So with the code as written you get two iterations each for Joe and Ray, making four lines of output, with i ending at 2 in both cases (i=1 and 2 for Joe, and i=1 and 2 for Ray) and avrg missing on all four lines.
HOWEVER,
if you comment out (1), then whether or not you uncomment (2), the output has only two lines, with the variables name, a1-a5 and avrg listed, and
if you leave (2) as it is (commented out), i stays in the output and equals 3 on both lines.
My understanding is that, when (1) is commented out, SAS takes an extra step (i=3) after completing each data line, calculates avrg, and outputs it for PROC PRINT to show.
What confuses me is why SAS doesn't do this when statement (1) is left in.
Does the OUTPUT inside the DO-END loop prevent SAS from taking the extra step to complete the avrg calculation and output it? (Or is avrg calculated anyway, just not output in the first case?)
Please advise. I appreciate it!
The key is what happens when you use the OUTPUT statement in a data step.
If the data step has no OUTPUT statement, SAS performs an implicit output at the bottom of the step on every iteration, as if an OUTPUT sat just before the RUN statement. (To be exact, a DELETE skips that output for the current observation, and a RETURN triggers it early.)
If you do include an OUTPUT statement, SAS assumes you do not want the automatic output, so it writes an observation only at the point where you specify.
In your data step, when OUTPUT is specified, the output takes place before the statement avrg=MEAN(OF a1-a5);
At that point there has been no calculation, so avrg is missing. SAS does do the calculation later, but by then it is too late to include it in the output data set and the result is wasted.
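To illustrate the timing, here is a minimal sketch (the dataset name two is made up for the example) that keeps an explicit OUTPUT but places it after the avrg assignment and outside the loop. Arranged this way, each input line should produce a single row with avrg filled in, like OUTPUT2 but without the i column.

```sas
/* Sketch only: "two" is a hypothetical dataset name */
DATA two;
  INPUT name $ a1-a5;
  ARRAY a {*} a1-a2;
  DO i = 1 TO 2;
    a(i) = 6 - a(i);          /* same transformation as the original step */
  END;
  avrg = MEAN(OF a1-a5);      /* calculate first ... */
  OUTPUT;                     /* ... then write the observation */
  DROP i;
DATALINES;
Joe 5 5 5 5 5
Ray 1 1 1 1 1
;
RUN;

PROC PRINT DATA=two;
RUN;
```

Removing the OUTPUT statement from this sketch should give the same rows, because the implicit output happens at the same point.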
Also, when you have an OUTPUT statement inside a DO loop, an output row is produced on every pass through the loop, so you get 2 records for each input line.
SAS increments the loop variable at the end of each pass through the DO loop and then tests whether it is still in the specified range. If it is not, the loop stops. That is why i=3 (3 > 2, so stop looping) in your second dataset. If you do not drop i in your first dataset, you would see the values 1 and 2, which are the values inside the loop.
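A small sketch of the loop-variable behaviour, using DATA _NULL_ so that no dataset is created. The message text in the PUT statements is invented for the example. The log should show i=1 and i=2 inside the loop and i=3 once the loop has finished.

```sas
DATA _NULL_;
  DO i = 1 TO 2;
    PUT 'inside loop: ' i=;   /* runs for i=1 and i=2 */
  END;
  PUT 'after loop: ' i=;      /* i was incremented past the upper bound */
RUN;
```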
Regards
Richard in Oz
Thank you so much, Richard.
To extend my question, do you happen to know how to make SAS show its step-by-step debugging activity?
I appreciate your help anyway.
-Sarah
Sarah,
You are welcome.
I have not used the debugging facility in, oh, many years so I can't help. Perhaps someone else can.
If I want to see what the state of the variables in a data step is at any point I just insert the statement
Put _ALL_ ;
which will list all the current variables (including automatic variables) and their values at that point.
Then I can move that line to another point and see what has changed in the next run of the same step.
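As a rough sketch of that approach applied to the step from the question (DATA _NULL_ is used here only to avoid overwriting dataset one, and the label text is invented for the example):

```sas
DATA _NULL_;
  INPUT name $ a1-a5;
  ARRAY a {*} a1-a2;
  DO i = 1 TO 2;
    a(i) = 6 - a(i);
    PUT 'In loop:';
    PUT _ALL_;                /* avrg is still missing at this point */
  END;
  avrg = MEAN(OF a1-a5);
  PUT 'After MEAN:';
  PUT _ALL_;                  /* avrg now has a value, and i is 3 */
DATALINES;
Joe 5 5 5 5 5
Ray 1 1 1 1 1
;
RUN;
```

The log lines also include the automatic variables _N_ and _ERROR_, which helps to see which input line each dump belongs to.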
Richard in Oz.
I'm not much of a user of the debugging facility myself.
You need to specify the / DEBUG option on the DATA statement to switch on the facility; after that it is pretty much a line-command operation.
Check this good paper from S. David Riba about the subject:
http://www2.sas.com/proceedings/sugi25/25/btu/25p052.pdf
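A minimal sketch of switching the debugger on for the step from the question, assuming an interactive SAS session where the DATA step debugger window is available. The commands in the comment are typed on the debugger command line and are not part of the program; see the paper above for the full command set.

```sas
DATA one / DEBUG;             /* the DEBUG option starts the debugger when the step runs */
  INPUT name $ a1-a5;
  ARRAY a {*} a1-a2;
  DO i = 1 TO 2;
    a(i) = 6 - a(i);
    OUTPUT;
  END;
  avrg = MEAN(OF a1-a5);
DATALINES;
Joe 5 5 5 5 5
Ray 1 1 1 1 1
;
RUN;

/* Typical debugger commands (entered in the debugger, not in the program):
     step            execute the next statement
     examine _all_   show the current values of all variables
     quit            leave the debugger                                   */
```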
Cheers from Portugal.
Daniel Santos @ www.cgd.pt
Besides that, there's also the subsetting IF, which is more of a coding style.
IF <condition>;
with no THEN clause, which is equivalent to:
IF NOT (<condition>) THEN DELETE;
The implicit output is still made only at the end of each iteration (if no explicit OUTPUT exists), but it may not happen at all, because execution reaches the end of the step only when the observation passes the IF.
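A small sketch of the subsetting IF, with made-up dataset and variable names (scores, keep_if, keep_delete, score). The first step keeps only the observations for which the condition is true; the second uses DELETE with the negated condition and should produce the same rows.

```sas
DATA scores;                  /* hypothetical input data */
  INPUT name $ score;
DATALINES;
Joe 5
Ray 1
;
RUN;

DATA keep_if;
  SET scores;
  IF score > 3;               /* subsetting IF: continue only when true */
RUN;

DATA keep_delete;
  SET scores;
  IF NOT (score > 3) THEN DELETE;   /* same effect, written with DELETE */
RUN;
```

In both steps the observation for Ray never reaches the implicit output at the end of the step, so only Joe is written.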
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000201978.htm
Cheers from Portugal.
Daniel Santos @ www.cgd.pt