07052016 11:19 AM
Hi,
I have some issues with the following code. The code works well for small datasets (ie. for 100 observations like in the sample code),but for some reason when I try to run it with a grater number of observations the code won't finish and the Worklibrary fills up (288Gb). I'm running the code on SAS EG version 5.1 if that has something to do with it.
data work.want;
set work.give (obs=100);
by ID1 ID2;
difference=0;
if first.ID1 then row=0;
if first.ID2 then row=0;
row +1;
output;
do while (difference < 100);
if last.ID2 then do;
_n_ = row;
call missing(ABC);
days = days + plus;
difference = abs(today()  days);
row = _n_+1;
output;
end;
end;
run;
Here's a quick scetch of what I would like to have:
I got:
ABC  ID1  ID2  Days  Plus 
123  1A  ac  1  50 
453  1A  bd  1  100 
654  2A  ac  1  50 
456  2A  jk  1  25 
234  3A  ac  1  50 



What I want:
ABC  ID1  ID2  Days  Plus 
123  1A  ac  1  50 
 1A  ac  51  50 
 1A  ac  101  50 
453  1A  bd  1  100 
 1A  bd  101  100 
654  2A  ac  1  50 
 2A  ac  51  50 
 2A  ac  101  50 
456  2A  jk  1  25 
 2A  jk  26  25 
 2A  jk  51  25 
 2A  jk  76  25 
 2A  jk  101  25 
234  3A  ac  1  50 
 3A  ac  51  50 
 3A  ac  101  50 
The dataset that I'm working with has around 5000 observations (=unique ID1  ID2 combinations). The variable in the dowhile loop should be somewhere around 1095. Variable "plus" varies between 5 and 3650. The current SAS setup that I'm working on is easily capeable of handling these kinds of datasets (I think this calculation would yield around 1M rows, that should be manageable).
What am I doing wrong? Are there any more efficient ways to program this?
Thanks a bunch in advance.
Regerds,
Tapir
07052016 11:35 AM
When I get LOTS of possibly unexpected output look at where OUTPUT statements are located.
I would examine all of the variables involved with this block of code:
do while (difference < 100);
if last.ID2 then do;
_n_ = row;
call missing(ABC);
days = days + plus;
difference = abs(today()  days);
row = _n_+1;
output;
end;
end;
If your PLUS variable ever is negative your value of difference has the potential to cause lots of issues with values of 1, 2, 3, 1000000.
I might suggest should the do while value be (0 <= difference <100) or similar