Solved: Re: sum statement within DO UNTIL()

SAS_inquisitive · Posted 03-23-2016 12:02 PM

This was described as the "Henderson-Whitlock Original Form of DoW-loop" at SAScommunity.org. Since the sum statement initializes its variable to 0, why have the sum and count variables been explicitly initialized to 0?

data a;
	input id $ var;
	datalines;
A 1 
A 2 
B 3 
B 4 
B 5 
;

data b;
	count= 0;
	sum = 0;

	do until ( last.id );
		set a;
		by id;
		count+1;
		sum+var;
	end;

	mean = sum / count;
run;

ballardw · Posted 03-23-2016 12:28 PM

Take a look at the sum and mean for ID b when you run the code without the initialization.

SAS_inquisitive · Posted 03-23-2016 12:40 PM

@ballardw, I checked that before posting this. Without the initialization, it gives the correct result for the first BY group (A) but not for the second (B): for the second group, var is accumulated across both A and B. With the following version, however, no initialization is required.

data a;
	input id $ var;
	datalines;
A 1 
A 2 
B 3 
B 4 
B 5 
;

data b;
	do until ( last.id );
		set a;
		by id;
		count=sum(count,1);
		sum=sum(sum,var);
	end;

	mean = sum / count;
run;

FreelanceReinh · Posted 03-23-2016 02:25 PM

That's true. The sum statement not only implies the initialization to 0, but also a RETAIN for the variable being incremented.

This implicit RETAIN conflicts with the DOW loop. One of the loop's major purposes is to get a "RETAIN effect" within the loop (i.e., within one iteration of the data step) while still letting the standard data step behavior set (unretained) variables to missing automatically at the beginning of each data step iteration.
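A minimal sketch of that conflict, reusing the sample data from the question. The dataset names a_demo and b_noinit are made up for this illustration; the only intended change from the original code is that the two explicit initializations are left out.

```sas
/* Hypothetical names: a_demo and b_noinit exist only for this sketch. */
data a_demo;
	input id $ var;
	datalines;
A 1
A 2
B 3
B 4
B 5
;

data b_noinit;
	/* No "count = 0; sum = 0;" here, so the implicit RETAIN of the */
	/* sum statements carries both totals into the next BY group.   */
	do until (last.id);
		set a_demo;
		by id;
		count + 1;
		sum + var;
	end;

	mean = sum / count;
run;

proc print data=b_noinit;
	var id count sum mean;
run;
```

With this data, group A should come out as count=2, sum=3, mean=1.5, while group B shows count=5, sum=15, mean=3 instead of the intended 3, 12 and 4, because the values from A were never reset.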

Therefore, in many DOW loops in practice the sum statement is avoided and the SUM function is used instead. Thus, the assignment statement looks a bit less elegant, but the advantage is that you can omit the initialization, because the SUM function does not imply a RETAIN. It shares with the sum statement the desired property of "missing plus x equals x" (unlike an assignment such as count=count+1).
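The "missing plus x equals x" property can be checked in isolation with a small sketch like the one below. The dataset and variable names (plus_demo, x, by_plus, by_func) are made up for this illustration.

```sas
/* Hypothetical names, for illustration only. */
data plus_demo;
	x = .;                 /* start from a missing value */
	by_plus = x + 1;       /* plain arithmetic: missing + 1 stays missing */
	by_func = sum(x, 1);   /* SUM function skips the missing argument */
	put by_plus= by_func=;
run;
```

The log should show by_plus=. and by_func=1, which is why count=count+1 would never get off the ground in an uninitialized DOW loop while count=sum(count,1) does.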

There is only one case where the results are different: If all values added are missing, the sum statement returns 0 (due to the implicit initialization), whereas the SUM function returns a missing value, which is actually the more accurate result in many cases. (The "Missing values were generated ..." notes in the log are the downside.)
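A sketch of that one differing case, assuming a small made-up dataset c in which every value of var in group A is missing. The names c, stmt, func and total are hypothetical.

```sas
/* Hypothetical data: all values in BY group A are missing. */
data c;
	input id $ var;
	datalines;
A .
A .
B 3
;

/* Sum statement: needs the explicit reset, returns 0 for group A. */
data stmt;
	total = 0;
	do until (last.id);
		set c;
		by id;
		total + var;
	end;
run;

/* SUM function: no reset needed, returns missing for group A. */
data func;
	do until (last.id);
		set c;
		by id;
		total = sum(total, var);
	end;
run;
```

In stmt, total should be 0 for A and 3 for B. In func, total should be missing for A and 3 for B, and the log for func should contain the "Missing values were generated ..." note mentioned above.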

SAS_inquisitive · Posted 03-23-2016 03:33 PM

@FreelanceReinh, thanks. This helped me see what is going on inside one iteration of the DOW loop.

data b;
	do until ( last.id );
		put _all_;
		set a;
		put _all_;
		by id;
		put _all_;
		count=sum(count,1);
		put _all_;
		sum=sum(sum,var);
		put _all_;
	end;

	mean = sum / count;
run;
