I would like the lines below for c1= to c4= to go on for 14 columns. That means col5 is col14 and I should have c1= to c14=.
Could someone please share a macro to do this loop?:
data a; set tran;
const=0;
c1=log((sum(of col1-col1)+const)/(sum(of col2-col5)+const));
c2=log((sum(of col1-col2)+const)/(sum(of col3-col5)+const));
c3=log((sum(of col1-col3)+const)/(sum(of col4-col5)+const));
c4=log((sum(of col1-col4)+const)/(sum(of col5-col5)+const));
run;
To do that in macro code you will need to use a %DO loop. Note you will only be able to generate N-1 statements.
%macro running_log_sum(n);
%local i ;
%do i=1 %to &n-1;
c&i=log(sum(of const col1-col&i)/sum(of const col%eval(&i+1)-col&n));
%end;
%mend running_log_sum;
Then call the macro to generate those statements inside of your data step.
data a;
set tran;
const=0;
%running_log_sum(15)
run;
If you call it with N=7 you will get these statements generated.
MPRINT(RUNNING_LOG_SUM): c1=log(sum(of const col1-col1)/sum(of const col2-col7)); MPRINT(RUNNING_LOG_SUM): c2=log(sum(of const col1-col2)/sum(of const col3-col7)); MPRINT(RUNNING_LOG_SUM): c3=log(sum(of const col1-col3)/sum(of const col4-col7)); MPRINT(RUNNING_LOG_SUM): c4=log(sum(of const col1-col4)/sum(of const col5-col7)); MPRINT(RUNNING_LOG_SUM): c5=log(sum(of const col1-col5)/sum(of const col6-col7)); MPRINT(RUNNING_LOG_SUM): c6=log(sum(of const col1-col6)/sum(of const col7-col7));
Note if you are already using a macro to generate that data step then there is no need to generate another macro since you can just put the %DO loop into your current macro at that point.
If you want to do it without macro code then a little restructuring will make it more efficient.
data a;
set tran;
const=0;
array cols col1-col14 ;
array c [13] ;
total = sum(of const cols[*]);
left = const;
do i=1 to dim(cols)-1;
left+cols[i];
c[i] = log(left/(total-left));
end;
drop total left;
run;
You likely don't need any macro code. arrays in a data step should do it.
data want; set tran; Array c (14); /* This assumes that the variables C1 through c14 do not exist*/ /* this array statement will create C1 through c14*/ Array cl (*) col1 - col15 /* this assumes the varaibles col1 -col15 exist and that you meant to have col15 as the last instead of col5 (not actually stated in your problem*/ do i= 1 to 14; temp = sum(of cl[1] - cl[i ])+const)/(sum(of cl[i+1]-cl[15])+const); if temp > 0 the c[i]=log(temp); end; drop i temp; run;
Arrays are way of having shorthand references to groups of variables. The ARRAY statement creates the association with a (usually short name, such as C or CL above). Then and INDEX value that appears in parentheses identifies which specific value in numeric order of definition is meant to be used. So C[i] references C1, C2, etc to C14. Note that arithmetic may be used in the index to reference offsets such as CL[i+1] which references CL2 when i=1 or the second variable Col2 in the CL array.
Since log of negative or 0 values is undefined I create a temporary variable to hold the results of the sums and division and test that is valid before using the log function. That might not be needed but I have no idea what actual values are in your variables.
Arrays are often the first thing to investigate when doing a series of related or basically identical calculations.
Variable lists cannot be made with array references. Variable lists are evaluated while the data step is being compiled. Array references occur while the data step is executing. They are incompatible.
1 data test; 2 array x [10] (1:10); 3 i=1; 4 y=sum(of x[1]-x[10]); - 22 ERROR 22-322: Syntax error, expecting one of the following: ), ','. 5 put (_all_) (=); 6 run; NOTE: The SAS System stopped processing this step because of errors. WARNING: The data set WORK.TEST may be incomplete. When this step was stopped there were 0 observations and 12 variables. WARNING: Data set WORK.TEST was not replaced because this step was stopped. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds
Hello @pink_poodle,
If your SAS installation (like my ageing SAS 9.4M5 from 2016) does not yet support the convenient variable list syntax using array references suggested by ballardw, you may want to resort to your idea of a macro loop, which could look like this:
%macro cdef(m);
%local i;
%do i=1 %to %eval(&m-1);
c&i=log((sum(of col1-col&i)+const)/(sum(of col%eval(&i+1)-col&m)+const));
%end;
%mend cdef;
data want;
set tran;
const=0;
%cdef(15)
run;
Note that your description
@pink_poodle wrote:
I would like the lines below for c1= to c4= to go on for 14 columns. That means col5 is col14 and I should have c1= to c14=.
is not consistent with your code, where the maximum col index equals the maximum c index plus 1, so not both maximum indices can be 14. Hence, you'll need to replace %cdef(15) with %cdef(14) in my suggested code if col15 does not exist in your real TRAN dataset.
To do that in macro code you will need to use a %DO loop. Note you will only be able to generate N-1 statements.
%macro running_log_sum(n);
%local i ;
%do i=1 %to &n-1;
c&i=log(sum(of const col1-col&i)/sum(of const col%eval(&i+1)-col&n));
%end;
%mend running_log_sum;
Then call the macro to generate those statements inside of your data step.
data a;
set tran;
const=0;
%running_log_sum(15)
run;
If you call it with N=7 you will get these statements generated.
MPRINT(RUNNING_LOG_SUM): c1=log(sum(of const col1-col1)/sum(of const col2-col7)); MPRINT(RUNNING_LOG_SUM): c2=log(sum(of const col1-col2)/sum(of const col3-col7)); MPRINT(RUNNING_LOG_SUM): c3=log(sum(of const col1-col3)/sum(of const col4-col7)); MPRINT(RUNNING_LOG_SUM): c4=log(sum(of const col1-col4)/sum(of const col5-col7)); MPRINT(RUNNING_LOG_SUM): c5=log(sum(of const col1-col5)/sum(of const col6-col7)); MPRINT(RUNNING_LOG_SUM): c6=log(sum(of const col1-col6)/sum(of const col7-col7));
Note if you are already using a macro to generate that data step then there is no need to generate another macro since you can just put the %DO loop into your current macro at that point.
If you want to do it without macro code then a little restructuring will make it more efficient.
data a;
set tran;
const=0;
array cols col1-col14 ;
array c [13] ;
total = sum(of const cols[*]);
left = const;
do i=1 to dim(cols)-1;
left+cols[i];
c[i] = log(left/(total-left));
end;
drop total left;
run;
Thank you so much, @ballardw , @FreelanceReinh and @Tom! These are very helpful solutions! I used the code chunk right above this message.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.