Solved: Re: output statement in the outer loop of nested loop

SAS_inquisitive · Posted 01-26-2016 01:42 PM

The following code produces 4 records when I place an OUTPUT statement in the outer DO block. Why does the outer OUTPUT statement write a record with i=4?

data test;
   set sashelp.class;

   if _n_=1 then
      do;
         do i=1 to 3;
            output;
         end;

         output;
      end;
run;

FreelanceReinh · Posted 01-26-2016 01:58 PM

The value of an index variable such as i in "do i=1 to n;" is always n+1 after the loop has finished. In most cases you don't see this value, but with your additional OUTPUT statement you bring it to light.

Edit:

Assumption for the above statement: n is a non-negative integer.
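
A quick way to see this for yourself, as a minimal sketch: an empty loop followed by a PUT statement, which writes the value of i to the log. No dataset is created because of `data _null_`.

```sas
data _null_;
   do i=1 to 3;
      /* nothing to do inside the loop */
   end;
   put i=;   /* writes i=4 to the log: the index is n+1 once the loop has finished */
run;
```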

Astounding · Posted 01-26-2016 02:24 PM

Just to add a small detail ...

DO loops permit a BY value:

do i=1 to 13 by 5;

The final value of i must be greater than 13 to end the loop. In this case, the final value will be 16, not 14.
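
As a small sketch, the values can be traced in the log with PUT statements (the literal text in quotes is only a label):

```sas
data _null_;
   do i=1 to 13 by 5;
      put 'inside the loop: ' i=;   /* i=1, then i=6, then i=11 */
   end;
   put 'after the loop: ' i=;       /* i=16, the first value greater than 13 */
run;
```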

FreelanceReinh · Posted 01-26-2016 02:59 PM

That's true. Similarly, i will be

3 (not 3.5) after finishing "do i=1 to 2.5;"
2 after finishing "do i=2 to 0;"
1 after finishing "do i=7 to 4 by -3;"
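
These three cases can be checked in one step. The following is just a sketch that runs each loop with an empty body and writes the resulting value of i to the log:

```sas
data _null_;
   do i=1 to 2.5;        /* iterates with i=1 and i=2 */
   end;
   put i=;               /* i=3 */

   do i=2 to 0;          /* no iteration at all, since 2 > 0 */
   end;
   put i=;               /* i=2 */

   do i=7 to 4 by -3;    /* iterates with i=7 and i=4 */
   end;
   put i=;               /* i=1 */
run;
```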

But the real surprises start when numeric representation issues come into play:

After finishing "do i=0 to 1 by 0.1;" (strongly discouraged!) i is not equal to 1.1 (although very close) on many platforms.
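
Here is a hedged sketch for testing this on your own platform. The variable names diff, k and x are only illustrative. The second loop shows one common way to avoid the issue, assuming an integer index is acceptable: count with integers and derive the fractional value by division.

```sas
data _null_;
   /* Discouraged: 0.1 has no exact binary representation, so rounding errors accumulate */
   do i=0 to 1 by 0.1;
   end;
   diff = i - 1.1;
   if i = 1.1 then put 'i is exactly equal to 1.1';
   else put 'i is not equal to 1.1: ' diff=;

   /* Alternative: integer index, fractional value computed in each iteration */
   do k=0 to 10;
      x = k/10;
   end;
   put k=;   /* k=11, an exactly representable integer */
run;
```

Which branch of the IF statement is taken depends on the platform, so no particular log output is claimed for the first loop.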

Babloo · Posted 01-27-2016 02:09 AM

Could you please explain how to work out the value of 'i'?

FreelanceReinh · Posted 01-27-2016 04:51 AM

@Babloo: Do you mean the value of i after finishing "do i=a to b by c;" for arbitrary numbers a, b, c?

In which particular case would you be unsure, given the examples?

And why would you need this value? In many cases the index variable i will be dropped anyway. Sometimes I use the automatic variable _n_ as the index variable because it is not written to the output dataset by default. In situations where the index variable is kept in the output dataset, the intention is typically to have the values "from a to b" (i.e. a, a+c, a+2c, ..., b) in this dataset, nothing more.
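
To illustrate the two options, a minimal sketch (the dataset names squares1 and squares2 and the variable x are made up for this example):

```sas
/* Option 1: drop the index variable explicitly */
data squares1(drop=i);
   do i=1 to 3;
      x = i**2;
      output;
   end;
run;

/* Option 2: use the automatic variable _n_ as the index;
   it is not written to the output dataset, so no DROP is needed */
data squares2;
   do _n_=1 to 3;
      x = _n_**2;
      output;
   end;
run;
```

Both datasets should contain three observations with the single variable x (1, 4, 9).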

Babloo · Posted 01-27-2016 05:14 AM

I was asking about the values of 'i' that you mentioned in your post, quoted below.

i will be

3 (not 3.5) after finishing "do i=1 to 2.5;"
2 after finishing "do i=2 to 0;"
1 after finishing "do i=7 to 4 by -3;"

Astounding · Posted 01-27-2016 08:49 AM

Babloo,

Here's the process that takes place. At the END statement, the software takes two steps:

Increment i by the amount of the BY value
Check to see if the loop is over. (That means, check if the incremented value of i exceeds the final value specified in the loop.)

If the loop is over, just continue with the rest of the DATA step. If the loop isn't over, run through it again using the incremented value of i.

This can lead to some interesting test cases. You wouldn't code this in real life, but see if you can work out what the final value of i would be in this loop:

do i=1 to 5;
   i = 2 * i;
end;

Good luck.
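
Once you have worked out an answer on paper, one way to check it is to trace the loop in the log. This is only a sketch; the quoted text in the PUT statements is just labeling:

```sas
data _null_;
   do i=1 to 5;
      put 'start of iteration: ' i=;
      i = 2 * i;
      put 'after doubling: ' i=;
   end;
   put 'after the loop: ' i=;
run;
```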

FreelanceReinh · Posted 01-27-2016 09:53 AM

@Babloo: I hoped the pattern would become clear in the examples.

@Astounding: Thanks for the explanation. I had already started to write something similar, but had to leave my office.

Here it is:

Let's consider an iterative DO loop of the form "do i=a to b by c;" with reasonable numbers a, b and c, c ne 0. Please note that c=1 by default if it is not specified, i.e. in the case "do i=a to b;". Let's further assume that the loop is not left prematurely due to a statement such as LEAVE, GOTO or LINK and that the value of variable i is not changed by assignment statements and the like.

My understanding is:

Step 1: At the beginning of the loop, i is set to a.

Step 2: It is checked whether an iteration of the loop will occur. No iteration will occur (i.e., the code inside the DO loop will not be executed) if either c>0 & a>b or c<0 & a<b. In this case, we still have i=a after the loop. (The corresponding example was "do i=2 to 0;".) Otherwise, continue with Step 3.

Step 3: The code inside the DO loop is executed with the current value of i.

Step 4: Variable i is incremented like i=i+c.

Step 5: It is checked whether another iteration of the loop will occur. No further iteration will occur and the loop will finish if either c>0 & i>b or c<0 & i<b. In this case, the current value of i is the one which will be present immediately after finishing the loop. Otherwise, continue again with Steps 3, 4, 5.
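
Steps 1 to 5 can be emulated with a DO WHILE loop. This is only a sketch of my understanding under the assumptions stated above (c ne 0, no LEAVE/GOTO/LINK, i not modified inside the loop), not a description of how SAS implements the iterative DO internally. The values of a, b and c are taken from the earlier example "do i=7 to 4 by -3;".

```sas
data _null_;
   a = 7; b = 4; c = -3;

   i = a;                                                  /* Step 1 */

   /* Steps 2 and 5: the condition is checked before every iteration */
   do while ((c > 0 and i <= b) or (c < 0 and i >= b));
      put 'inside the loop: ' i=;                          /* Step 3: loop body */
      i = i + c;                                           /* Step 4: increment */
   end;

   put 'after the loop: ' i=;                              /* i=1 for these values */
run;
```

With a=2, b=0, c=1 the condition is false right away, the body never runs, and i keeps the value 2, which matches the "do i=2 to 0;" example.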
