DATA Step, Macro, Functions and more

Iterative Sum

Reply
Occasional Learner
Posts: 1

Iterative Sum

I'm wanting to perform an iterative sum over a set of 24 consecutive random values that are a part of a total 96 values. My goal is to see which sum of 24 consecutive numbers (1-24, 2-25, 3-26, ... , 73-96) has the greatest sum. I've thought about using a macro or some kind of loop to do this, but I'm unsure exactly where to start. Any ideas?

Super User
Posts: 10,217

Re: Iterative Sum

Please supply some example data, so we have something to run code against. See my footnotes for how to post data.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 13,523

Re: Iterative Sum

[ Edited ]

You don't say where your initial values come from. This is a simulation that does what I think you are asking but is only one way and what you want for actual output may not quite match.

data example;
   array v {96};
   /* populate the array of values*/
   do i = 1 to dim(v);
      v[i] = rand('uniform');
   end;
   array s {73};
   /* create sums of groups 1-24, 2-25 etc of the V values*/
   do i= 1 to dim(s);
      do j = i to (i+23); 
         if j=i then s[i] = v[j];
         else s[i]=s[i]+v[j];
      end;
   end;
   /* find the max sum*/
   maxs = max(of s(*));
   /* which of the S values has the max*/
   maxi= whichn(maxs,of s(*));
run;

 

 

Nothing slick just brute force. Your might want to work out why the magic numbers 73 and 23 are there.

If you don't actually want the v or s values in the final result drop them.

If you want more records to create multiple simulations wrap the code in another do loop to create multiple runs and add an OUTPUT statement inside that loop, probably after the assignment of the maxi variable.

 

Note that with this approach there is likely some rounding going on that the default display formats may or may not show in the totals of each group. If you see two values that appear equal in the S array you might set formats for S variables to something like f19.16 and see if they are still equal.

Valued Guide
Posts: 590

Re: Iterative Sum

How does you data look like? There are several ways to do it. If you have group by variables that you might use transpose and then arrays to calculate the consecutive values.

 

Another way might be something like:

data have;
do start=1 to 100;
val=start+10;
output;
end;
run;

proc sql;
create table test as 
select start,start+23 as end_Val,max(sum) as Max_Sum
		from (select a.start,sum(b.val) as sum
					from have a,have b
						where a.start<=b.start and b.start-a.start<=23
					group by a.start)
having max(sum)=sum
;
quit;
Thanks,
Suryakiran
Esteemed Advisor
Posts: 5,524

Re: Iterative Sum

Create dataset want:

 

%let totaln=96;
%let grpSize=24;

data test;
call streaminit(19867896);
do i = 1 to &totaln;
    x = rand("uniform");
    output;
    end;
run;

data want;
set test;
do grpId = max(1, i - &grpSize + 1) to min(&totaln - &grpSize + 1, i);
    output;
    end;
run;

proc sort data=want; by grpId i; run;

Then do your stats on dataset want by grpId.

 

PG
PROC Star
Posts: 8,164

Re: Iterative Sum

Here is an approach that uses a FIFO stack to create and sum all 73 subsets:

 

%let totaln=96;
%let grpSize=24;
%let numrecs=100000;

data test (drop=i j);
  call streaminit(19867896);
  array v(&totaln.) var1-var&totaln.;
  do i=1 to &numrecs.;
    do j = 1 to &totaln;
      v(j) = rand("uniform");
    end;
    output;
  end;
run;

data want(drop=i j max);
  set test;
  array nums(*) var1-var96;
  array stack(0:23) _temporary_;
  max=0;
  startloc=0;
  do i=1 to 73;
    if i=1 then do j=1 to 24;
      stack(mod(j,24))=nums(j);
    end;
    else stack(mod(i-1,24))=nums(i+23);
    if sum(of stack(*)) gt max then do;
      max=sum(of stack(*));
      startloc=i;
    end;
  end;
run;

That produces the same results as @ballardw's code, but runs 5 times faster.

 

Art, CEO, AnalystFinder.com

 

Trusted Advisor
Posts: 1,246

Re: Iterative Sum

[ Edited ]

Here is an approach that does not use a FIFO stack or rather an implicit one (but @art297's dataset TEST).

 

data want(drop=i s);
  set test;
  array var[96];
  s=sum(of var1-var24);
  startloc=1;
  max=s;
  do i=2 to 73;
    s+(var[i+23]-var[i-1]);
    if s>max then do;
      max=s;
      startloc=i;
    end;
  end;
run;

That produces the same results as Art's code, but runs approx. 2.4 times faster (and keeps max; tested with &numrecs=5000000).

Ask a Question
Discussion stats
  • 6 replies
  • 104 views
  • 3 likes
  • 7 in conversation