Hello All,
First of all, I want to thank this community, which has been very helpful.
I am getting inconsistent results from an array program. I am trying to sum several variables at the same time using arrays, but the results are incorrect. Please help me find the issue with the program.
data want;
  set have;
  by ID;
  array class (*) CLASS_1-CLASS_30;
  array add (*) SUM_CLASS_1-SUM_CLASS_30;
  do i=1 to dim(class);
    if first.ID then add(i) =0;
    add(i) + class(i);
    if last.ID;
  end;
  keep ID SUM_CLASS_1-SUM_CLASS_30;
run;
Note: The CLASS_1-CLASS_30 variables have numeric values.
Thank you
There are two clear logic errors.
First, you did not RETAIN the variables defined in the ADD array, so on each new iteration of the data step they will be set to missing.
Second, you are stopping the data step in the middle of the DO loop. The subsetting IF (if last.ID;) sits inside the loop, so on every observation except the last one in the ID group the step ends right after the first array element has been incremented, and the remaining elements are never touched.
Try this instead:
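data want;
  set have;
  by ID;
  array class CLASS_1-CLASS_30;
  array add SUM_CLASS_1-SUM_CLASS_30;
  retain SUM_CLASS_1-SUM_CLASS_30;
  /* reset all accumulators at the start of each ID group */
  if first.id then call missing(of add[*]);
  /* add the current observation to every accumulator */
  do i=1 to dim(class);
    add[i] + class[i];
  end;
  /* subsetting IF after the loop: output one row per ID */
  if last.ID;
  keep ID SUM_CLASS_1-SUM_CLASS_30;
run;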
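One assumption behind the step above: the BY statement expects HAVE to be sorted (or at least grouped) by ID, otherwise the step stops with a "BY variables are not properly sorted" error. A minimal sketch, in case the data is not already in that order:

```sas
/* Sort HAVE by ID so that FIRST.ID and LAST.ID mark the group boundaries */
proc sort data=have;
  by ID;
run;
```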
Hello @abhi309,
Unless this is an exercise in DATA step programming, I would prefer PROC SUMMARY for this task:
proc summary data=have;
  by id;
  var class_1-class_30;
  output out=want(drop=_:) sum=sum_class_1-sum_class_30;
run;
PS: The RETAIN in the DATA step is redundant thanks to the Sum statement. (Interestingly, even using the Sum statement with only one array reference, say add[17], would imply a RETAIN for all 30 elements of array add.)
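To see the implied RETAIN for yourself, here is a small sketch. The data set names HAVE_DEMO and WANT_DEMO and the three rows of input are made up for illustration, and only two CLASS variables are used to keep it short. The step contains no RETAIN statement at all.

```sas
/* Hypothetical test data: two ID groups, two CLASS variables */
data have_demo;
  input ID CLASS_1 CLASS_2;
  datalines;
1 10 1
1 20 2
2 5 7
;
run;

data want_demo;
  set have_demo;
  by ID;
  array class CLASS_1-CLASS_2;
  array add SUM_CLASS_1-SUM_CLASS_2;
  /* no RETAIN statement: the Sum statement below retains the accumulators */
  if first.ID then call missing(of add[*]);
  do i=1 to dim(class);
    add[i] + class[i];   /* Sum statement, treats a missing accumulator as 0 */
  end;
  if last.ID;
  keep ID SUM_CLASS_1-SUM_CLASS_2;
run;

proc print data=want_demo noobs;
run;
```

With this input, WANT_DEMO should contain one row per ID: 30 and 3 for ID 1, 5 and 7 for ID 2. If the accumulators were not retained, ID 1 would show only the values from its last observation (20 and 2).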
A description of what that code is supposed to accomplish would be helpful. Examples of input data and expected/desired output would help as well.
When we are given code that gets an "inconsistent result" without knowing the input or the expected output, we have to make many guesses about what is going on and are quite likely to miss something.