BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robulon
Quartz | Level 8
Hi, I'm hoping for some suggestions on how to go about solving a problem. I have a total of 253 payments due that should add up to £14,843.87. The total payment received is £14,661.00 but I am unable to tell how many individual payments make up this amount. What I am hoping to do is run some code that will identify the possible combinations of the due payments that add up to the missing £182.87. I know the missing amount must be made up of more than one payment (as the biggest single payment is £115.40), and cannot be made up of more than six payments (as the lowest seven payments add up to £189.89). Is there a way I can run some code that will look at the different possible combinations of the payment amounts and tell me which one(s) add up to £182.87. As always, really grateful for any advice on how to go about solving this. Thanks, Rob
1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

You could try something like this, here with example data:

data have;
  do id=1 to 253;
    payment=mod(id,19)+1;
    output;
    end;
run;

proc sort data=have;
  by payment;
run;

%let wanted=43; /* the sum that you are looking for */
data want;
  array payments (1000) 8 _temporary_;
  do _N_=1 to nobs;
    set have nobs=nobs;
      payments(_N_)=payment;
      end;
  array idx 8 idx1-idx6;
  %macro check;
  %local comb;
  %do comb=2 %to 6;
  call missing(of idx1-idx&comb);
  ncomb=comb(nobs,&comb);
  do i=1 to ncomb;
    rc=lexcombi(nobs,&comb,of idx1-idx&comb);
    sum=0;
    do _N_=1 to &comb;
      sum+payments(idx(_N_));
      end;
    if round(sum,0.01)>&wanted then leave;
    if round(sum,0.01)=&wanted then output;
    end;
  %end;
  %mend;
  %check;
keep idx1-idx6; run;

Sorting is necessary to break out of the loop as soon as values get too large, otherwise it will take forever. (I just changed the sort to be by payment, not ID, as @FreelanceReinh reminded me, the other was a slip of fingers on the keyboard).

The variables idx1-idx6 should contain the observation numbers (in the sorted data) of observations that add up to &wanted.

 

The ROUND function calls are not necessary in this example, as the test data has no decimals. But with decimals, you may have to round in order to catch all the possible fits.

 

The reason for the macro construct is only that if you have too many indices in the LEXCOMBI call, they will all be set to missing, so it cannot be done in a data step loop.

View solution in original post

8 REPLIES 8
s_lassen
Meteorite | Level 14

You could try something like this, here with example data:

data have;
  do id=1 to 253;
    payment=mod(id,19)+1;
    output;
    end;
run;

proc sort data=have;
  by payment;
run;

%let wanted=43; /* the sum that you are looking for */
data want;
  array payments (1000) 8 _temporary_;
  do _N_=1 to nobs;
    set have nobs=nobs;
      payments(_N_)=payment;
      end;
  array idx 8 idx1-idx6;
  %macro check;
  %local comb;
  %do comb=2 %to 6;
  call missing(of idx1-idx&comb);
  ncomb=comb(nobs,&comb);
  do i=1 to ncomb;
    rc=lexcombi(nobs,&comb,of idx1-idx&comb);
    sum=0;
    do _N_=1 to &comb;
      sum+payments(idx(_N_));
      end;
    if round(sum,0.01)>&wanted then leave;
    if round(sum,0.01)=&wanted then output;
    end;
  %end;
  %mend;
  %check;
keep idx1-idx6; run;

Sorting is necessary to break out of the loop as soon as values get too large, otherwise it will take forever. (I just changed the sort to be by payment, not ID, as @FreelanceReinh reminded me, the other was a slip of fingers on the keyboard).

The variables idx1-idx6 should contain the observation numbers (in the sorted data) of observations that add up to &wanted.

 

The ROUND function calls are not necessary in this example, as the test data has no decimals. But with decimals, you may have to round in order to catch all the possible fits.

 

The reason for the macro construct is only that if you have too many indices in the LEXCOMBI call, they will all be set to missing, so it cannot be done in a data step loop.

FreelanceReinh
Jade | Level 19

@s_lassen: Nice approach! Wise to use the ROUND function.

 

I assume the intention was to sort by payment, not by id.

 

I'd suggest modifying the stopping rule a bit, e.g.

if round(sum,0.01)>&wanted & i>&maxiter then leave;

where macro variable maxiter would be defined as a "large" integer, e.g. 1e7. (There might be more efficient stopping rules than this.)

 

Maybe my test dataset, shown below, containing random numbers (or the fact that I chose the uniform distribution to generate them) was unfortunate, but with the original stopping rule no matching sum was found.

 

data have;
call streaminit(27182818);
do _n_=1 to 253;
  payment=round(rand('uniform',1,115.4),.01);
  output;
end;
run; 

proc print data=have(obs=4);
sum payment;
run; /* --> wanted=252.22 */

proc sort data=have;
by payment;
run;

Without a stopping rule 9511 combinations (of 2 [did not occur], 3 or 4 payments -- I omitted the checks of 5 and 6) summing to 252.22 were found, with the modified stopping rule (using maxiter=1e7) at least 372 of them (not including the a priori "known" combination, though). Run times were <20 and <2 seconds, respectively.

 

@robulon: The above example indicates that you might get too many solutions: thousands or even millions (see below!) of different combinations summing to 182.87. This is because the number of combinations exceeds the number of possible sums by several orders of magnitude (assuming that all payment values are integer multiples of 0.01). A coarse lower bound for the average number of combinations adding up to the same given sum can be obtained as follows:

data _null_;
do k=2 to 6;
  s+comb(253,k)/(k*11540);
end;
put s=;
run; /* Result: s=5103162.1672 */

So, it would be surprising if none of the sums of 2, 3 or 4 payments were equal to 182.87. (And these cases are manageable with s_lassen's solution without the stopping rule.) Even if that happened, you could proceed by selecting one combination (of 4 payments) per distinct sum and combine these (<50,000) sums, restricted to sums <182.87, in each case with the remaining 249 payment values. Thus (most likely), you'll find a suitable combination of 5 payments and then many more when you go back to the pertinent sum of 4 payments and select all combinations leading to that particular sum.

s_lassen
Meteorite | Level 14

Yes, of course, the data should be sorted by payment. I shall edit my post to show the right variable. 

robulon
Quartz | Level 8

Thank you @s_lassen and @FreelanceReinhard, and apologies for not expressing my thanks earlier, I got a bit caught up in other matters. I've tried to resolve my issue using this solution but it appears to be falling down somewhere. Although I now know the figures that make up the total (obtained through other means but would still be great for future reference to get this sorted) are 75.9 and 106.97, the code does not identify this particular combination. In fact, it does not identify any combinations at all. I have tried amending the code using round(,0.1) and this does return some combinations but not the exact one.

 

I've tried some more testing putting the figures in as datalines to reduce the number of figures but it still isn't working. Would you be able to suggest where I'm going wrong? I'm wondering if it is something to do with the decimal places but have tried multiplying all the figures by 100 to turn them all into integers but am still not arriving at the solution.

 

This is what I'm using

 

data amounts;

input amount;

datalines;

 

75.9

106.97

100

255

643.59

251.25

;

run;

%let outstanding = 187.87;

 

 

data want;

array payments (1000) 8 _temporary_;

do _N_= 1 to nobs;

set amounts nobs = nobs;

 

payments(_N_) = amount;

end;

array idx 8 idx1 - idx6;

%macro check;

%local comb;

%do comb = 2 %to 6;

call missing(of idx1 - idx&comb.);

ncomb = comb(nobs, &comb.);

do i = 1 to ncomb;

rc = lexcombi(nobs, &comb., of idx1 - idx&comb.);

sum = 0;

do _N_ = 1 to &comb.;

 

sum + payments(idx(_N_));

end;

if round(sum,0.01) > &outstanding. then leave;

if round(sum,0.01) = &outstanding. then output;

 

end;

%end;

%mend check;

 

%check;

keep idx1 - idx6;

run;

 

As always, any advice will be gratefully received.

 

Rob

FreelanceReinh
Jade | Level 19

In your new example note that 75.9+106.97=182.87, not 187.87.

 

For the general case: Have you tried loosening (or even omitting) the stopping rule as I suggested in my previous post?

robulon
Quartz | Level 8

Oh dear, how embarrassing Can't even try the excuse that it was a slip of the fingers as 2 and 7 are nowhere near each other!

 

The loosening of the stopping rule seems to have worked. I've created a macro variable of the target amount earlier in the code using sql to calculate the difference between the expected amount and the actual amount and have use the code

if sum > &outstanding. + 9 then leave;

 

Interestingly enough (for me anyway), if I added a value that was any less than 9 to the macro variable, no results were returned whereas if I use 9 or higher, it correctly identifies the amounts. 

 

Huge thanks for the help people, always impressed by the number of members who are happy to share their expertise and help us less knowledgeable types develop their skills. next step for me is to work my way through the code so that I understand how it's doing what it's doing.

 

Thanks again,

Rob 

 

 

 

 

Ksharp
Super User

It looks like a OR problem. Try post it at OR forum and @RobPratt  is there .

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2607 views
  • 4 likes
  • 5 in conversation