SAS_inquisitive
Lapis Lazuli | Level 10

This was described as the "Henderson-Whitlock Original Form of DoW-loop" at SAScommunity.org. Since the sum statement initializes its variable to 0, why have the sum and count variables been explicitly initialized to 0?

 

data a;
	input id $ var;
	datalines;
A 1 
A 2 
B 3 
B 4 
B 5 
;

data b;
	count= 0;
	sum = 0;

	do until ( last.id );
		set a;
		by id;
		count+1;
		sum+var;
	end;

	mean = sum / count;
run;
ballardw
Super User

Take a look at the sum and mean for ID B when you run the code without the initialization.

 

 

SAS_inquisitive
Lapis Lazuli | Level 10

@ballardw: I checked that before posting this. Without the initialization, the code gives the correct result for the first BY group (A) but not for the second (B): for the second group, var is accumulated over both A and B. With the SUM function, however, no initialization is required:

 

data a;
	input id $ var;
	datalines;
A 1 
A 2 
B 3 
B 4 
B 5 
;

data b;
	do until ( last.id );
		set a;
		by id;
		count=sum(count,1);
		sum=sum(sum,var);
	end;

	mean = sum / count;
run;
FreelanceReinh
Jade | Level 19 (accepted solution)

That's true. The sum statement not only implies the initialization to 0, but also a RETAIN for the variable being incremented.

This implicit RETAIN conflicts with the DOW loop. One of the loop's major purposes is to provide a "RETAIN effect" within the loop (i.e. within one iteration of the data step), while still letting the standard data step behavior set (unretained) variables to missing automatically at the beginning of each data step iteration.
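
To make the conflict concrete, here is a minimal sketch that reuses data set a from the question but drops the two initializations. The output name b_noinit is only an illustrative choice, not something from the original code.

```sas
/* Sum statements without explicit initialization.
   count and sum are implicitly retained, so they are NOT reset
   to missing (or 0) when the data step starts its next iteration,
   i.e. when the next BY group begins. */
data b_noinit;
	do until ( last.id );
		set a;
		by id;
		count+1;    /* keeps counting across BY groups */
		sum+var;    /* keeps accumulating across BY groups */
	end;

	mean = sum / count;
run;

proc print data=b_noinit;
run;
```

With the five observations from the question, group A should come out correctly (count=2, sum=3, mean=1.5), while group B carries over the totals from A (count=5, sum=15, mean=3). That is why the original form assigns 0 to both variables at the top of the step.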

 

Therefore, in many DOW loops in practice, the sum statement is avoided and the SUM function is used instead. The assignment statement then looks a bit less elegant, but the advantage is that you can omit the initialization, because the SUM function does not imply a RETAIN. It shares with the sum statement the desired property of "missing plus x equals x" (unlike an assignment such as count=count+1).

 

There is only one case where the results differ: if all values added are missing, the sum statement returns 0 (due to the implicit initialization), whereas the SUM function returns a missing value, which is actually the more accurate result in many cases. (The "Missing values were generated ..." notes in the log are the downside.)
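
A small sketch of that one differing case, assuming a hypothetical data set c whose only BY group has nothing but missing values (the names c, c_stmt and c_func are made up for illustration):

```sas
/* Hypothetical test data: every value of var in group C is missing. */
data c;
	input id $ var;
	datalines;
C .
C .
;

/* Sum statement with explicit initialization:
   missing values are ignored, so sum stays at its initial 0. */
data c_stmt;
	count = 0;
	sum = 0;

	do until ( last.id );
		set c;
		by id;
		count+1;
		sum+var;
	end;

	mean = sum / count;
run;

/* SUM function without initialization:
   all arguments are missing, so sum remains missing. */
data c_func;
	do until ( last.id );
		set c;
		by id;
		count = sum(count, 1);
		sum = sum(sum, var);
	end;

	mean = sum / count;
run;
```

In c_stmt, sum and mean should both be 0, while in c_func they should both be missing, and the log for c_func should contain the "Missing values were generated" note. In both steps count is 2, since it counts observations rather than nonmissing values.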

 

SAS_inquisitive
Lapis Lazuli | Level 10

@FreelanceReinh: Thanks. This helped me see what is going on inside one iteration of the DOW loop.

 

data b;
	do until ( last.id );
		put _all_;
		set a;
		put _all_;
		by id;
		put _all_;
		count=sum(count,1);
		put _all_;
		sum=sum(sum,var);
		put _all_;
	end;

	mean = sum / count;
run;
