Hello,
Basically, I'm writing a data step that, for each ID, writes into a new variable all the variables that changed from the previous day.
I'm thinking something like this:
PROC SORT DATA=ccr_report_power_hist OUT=hist;
  BY ID_ANAG DESCENDING valdate;
run;

DATA hist;
  SET hist;
  BY ID_ANAG;
  IF _n_ EQ 1 THEN;
  DO;
    current_line = -1;
    RETAIN current_line;
  END;
  IF first.id_anag AND last.id_anag EQ 0 THEN DO;
    first_date = valdate;
    ARRAY current_values[*] _CHARACTER_;
    current_line = _n_;
  END;
  IF _n_ = current_line + 1 THEN DO;
    first_date = valdate;
    ARRAY temp_values[*] _CHARACTER_;
    DO i=1 TO DIM(temp_values);
      same_values = '';
      IF current_value[i] EQ temp_value[i] THEN DO;
        same_values = CATS(same_values, VNAME(current_value[i]));
      END;
    END;
  END;
run;
I think I'm misunderstanding how arrays are used in SAS. I get the "Undeclared array referenced" error for both arrays, but I'm quite sure that the first IF is entered at least once.
RETAIN and ARRAY statements should never be written inside conditional blocks. Both are acted upon during data step compilation, not data step execution, so placing them inside an IF block wrongly suggests that they only take effect when the condition is true.
This is also why your two arrays end up containing the same variables (and values). One would suffice.
IF current_value[i] EQ temp_value[i]
will always be true, as you just reference the same datastep variable.
BTW, your same_values would end up containing only the last matching variable name, since you reset it to empty inside the DO loop.
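To make the compile-time point concrete, here is a minimal sketch. The data set array_demo and the names a, b, first_arr, second_arr, seen_via_second and all_equal are made up for illustration. It declares one array inside a block that never executes and a second array over the same _CHARACTER_ list, then shows that both are just aliases for the same variables.

```sas
data array_demo;
  length a b $8;
  a = 'x';
  b = 'y';
  if 0 then do;
    /* This branch never runs, yet the array exists:
       ARRAY is processed at compile time. */
    array first_arr {*} _character_;
  end;
  array second_arr {*} _character_;  /* the same variables a and b, not copies */

  first_arr{1} = 'changed';          /* writes to variable a ...               */
  seen_via_second = second_arr{1};   /* ... and is visible through second_arr  */

  /* Comparing the two arrays element by element can never find a difference. */
  all_equal = 1;
  do i = 1 to dim(first_arr);
    if first_arr{i} ne second_arr{i} then all_equal = 0;
  end;
  drop i;
run;
```

In this sketch all_equal stays 1 and seen_via_second holds 'changed', which is why comparing two arrays built from the same variable list tells you nothing about the previous observation.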
What you probably wanted was
data hist;
  set hist;
  by id_anag;
  array temp {*} _character_;
  length same_values $500; * put sufficient length here;
  same_values = '';
  do i = 1 to dim(temp);
    if temp{i} eq lag(temp{i}) then same_values = cats(trim(same_values),vname(temp{i}));
  end;
  if first.id_anag then same_values = '';
run;
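One caveat worth checking before relying on the loop above: each LAG call written in the code keeps a single queue, so a call that executes several times per observation inside a DO loop returns the argument of the previous call rather than the value from the previous observation. A minimal sketch of an alternative keeps the previous row in a temporary array, whose values persist across observations. Everything here is an assumption for illustration: the sample data set hist_sample, its columns status and rating, a numeric id_anag (a character BY variable would itself be part of _CHARACTER_), and the upper bounds of 50 variables and length 200 for the prev array.

```sas
/* Hypothetical sample data, already sorted by id_anag and valdate */
data hist_sample;
  input id_anag valdate :yymmdd10. status $ rating $;
  format valdate yymmdd10.;
  datalines;
1 2024-01-01 open AA
1 2024-01-02 open BB
1 2024-01-03 closed BB
2 2024-01-01 open CC
2 2024-01-02 open CC
;
run;

data hist_same;
  set hist_sample;
  by id_anag;
  array temp {*} _character_;          /* character variables defined so far    */
  array prev {50} $200 _temporary_;    /* assumed bounds; kept across rows      */
  length same_values $500;
  same_values = '';
  do i = 1 to dim(temp);
    /* The first row of an ID has no previous row to compare with. */
    if (not first.id_anag) and (temp{i} eq prev{i}) then
      same_values = catx(' ', same_values, vname(temp{i}));
    prev{i} = temp{i};                 /* remember the value for the next row   */
  end;
  drop i;
run;
```

With the sample rows, the second row of ID 1 ends up with same_values = 'status' and the second row of ID 2 with 'status rating'. CATX is used instead of CATS so that the variable names are separated by a blank. To list changed rather than unchanged variables, the comparison would use NE instead of EQ.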
I forgot to retain current_line. That wasn't the problem, though. I've edited the original post.
Thanks, your help was very instructive. I find the workings of arrays in SAS a little confusing.
I had forgotten that LAG() exists; it's very useful.
I have just one question. You put
if first.id_anag then same_values = '';
at the end so that, if it's the first row of a block, the value is overwritten. Isn't it possible to wrap the whole DO loop in an IF-THEN and avoid executing it when it's the first row of a block?
That is done because of the lag() function. This function fills its FIFO queue only when called, so it is usually a bad idea to put it inside a conditional block. Using it in the if condition, OTOH, makes sure that it is called exactly once per datastep iteration and therefore always holds the value from the previous observation, and not one before that.
So rather than creating same_values conditionally, I cleaned it up afterwards in case a new group starts.
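The queue behaviour is easy to see in a small experiment. This is only a sketch with made-up data; lag_demo, grp, val, prev_always and prev_cond are hypothetical names. One LAG call runs on every observation, the other only on rows where grp is 'B'.

```sas
data lag_demo;
  input grp $ val $;
  prev_always = lag(val);                   /* executed on every observation    */
  if grp eq 'B' then prev_cond = lag(val);  /* own queue, fed only on 'B' rows  */
  datalines;
A x1
B x2
A x3
B x4
;
run;
```

On the fourth row prev_always is x3, the value from the previous observation, while prev_cond is x2, the value from the last time that conditional call actually ran.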
Very clear. Thanks.
There can be many additional issues once this one is cleared up, but this error message has an easy solution.
The array names and array references use different names. You added an "s" at the end of the array names, but left out the "s" later on when referring to array elements.