ARRAY referring to current row in do loop

Hello,

Basically, I'm writing a data step that, for each ID, writes into a new variable the names of all the variables that changed from the previous day.

I'm thinking something like this:

 

PROC SORT DATA=ccr_report_power_hist OUT=hist;
	BY ID_ANAG DESCENDING valdate;
run;

DATA hist;
	SET hist;
	BY ID_ANAG;
	IF _n_ 	EQ 1 THEN;
		DO;
			current_line = -1;
			RETAIN current_line;
		END;
	IF first.id_anag AND last.id_anag EQ 0 THEN 
		DO;
			first_date = valdate;
			ARRAY current_values[*] _CHARACTER_;
			current_line = _n_;
		END;
	IF _n_ = current_line + 1 THEN
		DO;
			first_date = valdate;
			ARRAY temp_values[*] _CHARACTER_;
			DO i=1 TO DIM(temp_values);
				same_values = '';
				IF current_value[i] EQ temp_value[i] THEN
				DO;
					same_values = CATS(same_values, VNAME(current_value[i]));
				END;
			END;
		END;	
run;

I think I'm misunderstanding how arrays are used in SAS. I get the "Undeclared array referenced" error for both arrays, but I'm quite sure that the first IF is entered at least once.


Accepted Solutions
Solution
‎10-17-2016 11:57 AM
Super User
Posts: 7,758

Re: ARRAY referring to current row in do loop

RETAIN and ARRAY statements should never be written inside conditional blocks. Both are declarative: they are acted upon during data step compilation, not during data step execution, so the surrounding IF has no effect on them and the code is misleading to read.

This is also why your two arrays end up containing the same variables (and values). One would suffice.

IF current_value[i] EQ temp_value[i]

will always be true, as you just reference the same datastep variable.
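
To illustrate the point, here is a minimal sketch (the variables a and b and the array names first_arr and second_arr are made up for the example). The first ARRAY statement sits in a branch that never executes, yet the array still exists because it is set up at compile time, and both arrays are just alternative names for the same variables:

```sas
data _null_;
  length a b $8;
  a = 'x';
  b = 'y';
  if 0 then do;
    /* this branch never runs, but the array is still declared at compile time */
    array first_arr[*] a b;
  end;
  array second_arr[*] a b;
  /* both arrays point at the same variables a and b */
  second_arr[1] = 'changed';
  first_arr[2]  = 'also';
  put a= b=;
run;
```

The log should show that a and b were both modified, whichever array was used for the assignment.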

 

BTW, your same_values would end up containing at most the name of the last variable, since you reset it to empty on every pass through the do loop.

 

What you probably wanted was

data hist;
set hist;
by id_anag;
array temp {*} _character_;
length same_values $500; * put sufficient length here;
same_values = '';
do i = 1 to dim(temp);
  if temp{i} eq lag(temp{i}) then same_values = cats(trim(same_values),vname(temp{i}));
end;
if first.id_anag then same_values = '';
run;
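
One caveat worth checking before relying on the step above: as far as the SAS documentation describes it, each LAG call written in the code keeps its own single queue. A lag(temp{i}) inside a do loop is one call site, so it would return the argument of the previous call (the previous array element) rather than the previous observation's value of the same variable. A minimal sketch of an alternative that avoids LAG, assuming a temporary array is used to remember the previous row; the names hist_flagged and prev and the bounds 200 and $200 are hypothetical and must be at least as large as the real number and length of the character variables:

```sas
data hist_flagged;
  set hist;
  by id_anag;
  array cur {*} _character_;
  /* temporary arrays are retained automatically; 200 and $200 are assumed upper bounds */
  array prev {200} $200 _temporary_;
  length same_values $500;
  same_values = '';
  do i = 1 to dim(cur);
    /* compare with the previous row only inside the same ID group */
    if not first.id_anag and cur{i} eq prev{i} then
      same_values = catx(' ', same_values, vname(cur{i}));
    /* always store the current value for the next row */
    prev{i} = cur{i};
  end;
  drop i;
run;
```

CATX is used here so the variable names are separated by a blank; replace eq with ne to list the variables that changed instead of those that stayed the same.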

 

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers


All Replies
Contributor
Posts: 33

Re: ARRAY referring to current row in do loop

I forgot to retain current_line. That wasn't the problem, though. I've edited the original post.

Contributor
Posts: 33

Re: ARRAY referring to current row in do loop

Posted in reply to KurtBremser

Thanks, your help was very instructive. I find the workings of arrays in SAS a little confusing.

I had forgotten that LAG() exists; it's very useful.

I have just one question. You put

if first.id_anag then same_values = '';

at the end so that same_values is overwritten when the row is the first one in a block. Isn't it possible to enclose the whole do loop in an if-then and avoid executing it when it's the first row in a block?

 

 

Super User
Posts: 7,758

Re: ARRAY referring to current row in do loop

That is done because of the lag() function. This function fills its FIFO queue only when it is called, so it is usually a bad idea to put it inside a conditional block. Using it in the IF condition, on the other hand, makes sure that it is called on every data step iteration, so the queue is updated for every observation and not just for some of them.

So rather than creating same_values conditionally, I clean it up afterwards whenever a new group starts.
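
A small sketch of why a conditional LAG is risky (the data set demo and the variables prev_always and prev_cond are invented for the illustration):

```sas
data demo;
  input x;
  datalines;
1
2
3
4
;
run;

data lag_demo;
  set demo;
  /* called on every row: returns the value of x from the previous row */
  prev_always = lag(x);
  /* called only on even rows: returns the value of x from the previous CALL */
  if mod(x, 2) = 0 then prev_cond = lag(x);
run;
```

On the fourth row prev_always is 3, while prev_cond is 2, because the conditional LAG last saw x on the second row.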

Contributor
Posts: 33

Re: ARRAY referring to current row in do loop

Posted in reply to KurtBremser

Very clear. Thanks.

Super User
Posts: 5,497

Re: ARRAY referring to current row in do loop

There can be many additional issues once this one is cleared up, but this error message has an easy solution.

 

The array definitions and the array references use different names. You declared the arrays as current_values and temp_values, with an "s" at the end, but left out the "s" later on when referring to the elements current_value[i] and temp_value[i].

☑ This topic is solved.
