Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Identifying combinations of variable that add up to a set amount

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 11-26-2019 07:09 AM
(1287 views)

Hi, I'm hoping for some suggestions on how to go about solving a problem. I have a total of 253 payments due that should add up to £14,843.87. The total payment received is £14,661.00 but I am unable to tell how many individual payments make up this amount. What I am hoping to do is run some code that will identify the possible combinations of the due payments that add up to the missing £182.87. I know the missing amount must be made up of more than one payment (as the biggest single payment is £115.40), and cannot be made up of more than six payments (as the lowest seven payments add up to £189.89). Is there a way I can run some code that will look at the different possible combinations of the payment amounts and tell me which one(s) add up to £182.87. As always, really grateful for any advice on how to go about solving this. Thanks, Rob

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You could try something like this, here with example data:

```
data have;
do id=1 to 253;
payment=mod(id,19)+1;
output;
end;
run;
proc sort data=have;
by payment;
run;
%let wanted=43; /* the sum that you are looking for */
data want;
array payments (1000) 8 _temporary_;
do _N_=1 to nobs;
set have nobs=nobs;
payments(_N_)=payment;
end;
array idx 8 idx1-idx6;
%macro check;
%local comb;
%do comb=2 %to 6;
call missing(of idx1-idx&comb);
ncomb=comb(nobs,&comb);
do i=1 to ncomb;
rc=lexcombi(nobs,&comb,of idx1-idx&comb);
sum=0;
do _N_=1 to &comb;
sum+payments(idx(_N_));
end;
if round(sum,0.01)>&wanted then leave;
if round(sum,0.01)=&wanted then output;
end;
%end;
%mend;
%check;
```

keep idx1-idx6;
run;

Sorting is necessary to break out of the loop as soon as values get too large, otherwise it will take forever. (I just changed the sort to be by payment, not ID, as @FreelanceReinh reminded me, the other was a slip of fingers on the keyboard).

The variables idx1-idx6 should contain the observation numbers (in the sorted data) of observations that add up to &wanted.

The ROUND function calls are not necessary in this example, as the test data has no decimals. But with decimals, you may have to round in order to catch all the possible fits.

The reason for the macro construct is only that if you have too many indices in the LEXCOMBI call, they will all be set to missing, so it cannot be done in a data step loop.

8 REPLIES 8

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You could try something like this, here with example data:

```
data have;
do id=1 to 253;
payment=mod(id,19)+1;
output;
end;
run;
proc sort data=have;
by payment;
run;
%let wanted=43; /* the sum that you are looking for */
data want;
array payments (1000) 8 _temporary_;
do _N_=1 to nobs;
set have nobs=nobs;
payments(_N_)=payment;
end;
array idx 8 idx1-idx6;
%macro check;
%local comb;
%do comb=2 %to 6;
call missing(of idx1-idx&comb);
ncomb=comb(nobs,&comb);
do i=1 to ncomb;
rc=lexcombi(nobs,&comb,of idx1-idx&comb);
sum=0;
do _N_=1 to &comb;
sum+payments(idx(_N_));
end;
if round(sum,0.01)>&wanted then leave;
if round(sum,0.01)=&wanted then output;
end;
%end;
%mend;
%check;
```

keep idx1-idx6;
run;

Sorting is necessary to break out of the loop as soon as values get too large, otherwise it will take forever. (I just changed the sort to be by payment, not ID, as @FreelanceReinh reminded me, the other was a slip of fingers on the keyboard).

The variables idx1-idx6 should contain the observation numbers (in the sorted data) of observations that add up to &wanted.

The ROUND function calls are not necessary in this example, as the test data has no decimals. But with decimals, you may have to round in order to catch all the possible fits.

The reason for the macro construct is only that if you have too many indices in the LEXCOMBI call, they will all be set to missing, so it cannot be done in a data step loop.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@s_lassen: Nice approach! Wise to use the ROUND function.

I assume the intention was to sort by *payment*, not by id.

I'd suggest modifying the stopping rule a bit, e.g.

if round(sum,0.01)>&wanted& i>&maxiterthen leave;

where macro variable maxiter would be defined as a "large" integer, e.g. 1e7. (There might be more efficient stopping rules than this.)

Maybe my test dataset, shown below, containing random numbers (or the fact that I chose the uniform distribution to generate them) was unfortunate, but with the original stopping rule no matching sum was found.

```
data have;
call streaminit(27182818);
do _n_=1 to 253;
payment=round(rand('uniform',1,115.4),.01);
output;
end;
run;
proc print data=have(obs=4);
sum payment;
run; /* --> wanted=252.22 */
proc sort data=have;
by payment;
run;
```

*Without* a stopping rule 9511 combinations (of 2 [did not occur], 3 or 4 payments -- I omitted the checks of 5 and 6) summing to 252.22 were found, with the modified stopping rule (using maxiter=1e7) at least 372 of them (not including the *a priori* "known" combination, though). Run times were <20 and <2 seconds, respectively.

@robulon: The above example indicates that you might get *too many* solutions: thousands or even millions (see below!) of different combinations summing to 182.87. This is because the number of combinations exceeds the number of possible sums by several orders of magnitude (assuming that all payment values are integer multiples of 0.01). A coarse lower bound for the *average* number of combinations adding up to the same given sum can be obtained as follows:

```
data _null_;
do k=2 to 6;
s+comb(253,k)/(k*11540);
end;
put s=;
run; /* Result: s=5103162.1672 */
```

So, it would be surprising if none of the sums of 2, 3 or 4 payments were equal to 182.87. (And these cases are manageable with s_lassen's solution without the stopping rule.) Even if that happened, you could proceed by selecting *one* combination (of 4 payments) per *distinct* sum and combine these (<50,000) sums, restricted to sums <182.87, in each case with the remaining 249 payment values. Thus (most likely), you'll find a suitable combination of 5 payments and then many more when you go back to the pertinent sum of 4 payments and select *all* combinations leading to that particular sum.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you @s_lassen and @FreelanceReinhard, and apologies for not expressing my thanks earlier, I got a bit caught up in other matters. I've tried to resolve my issue using this solution but it appears to be falling down somewhere. Although I now know the figures that make up the total (obtained through other means but would still be great for future reference to get this sorted) are 75.9 and 106.97, the code does not identify this particular combination. In fact, it does not identify any combinations at all. I have tried amending the code using round(,0.1) and this does return some combinations but not the exact one.

I've tried some more testing putting the figures in as datalines to reduce the number of figures but it still isn't working. Would you be able to suggest where I'm going wrong? I'm wondering if it is something to do with the decimal places but have tried multiplying all the figures by 100 to turn them all into integers but am still not arriving at the solution.

This is what I'm using

**data** amounts;

input amount;

datalines;

75.9

106.97

100

255

643.59

251.25

;

**run**;

%let outstanding = 187.87;

**data** want;

array payments (**1000**) **8** _temporary_;

do _N_= **1** to nobs;

set amounts nobs = nobs;

payments(_N_) = amount;

end;

array idx **8** idx1 - idx6;

**%macro** ** check**;

%local comb;

%do comb = **2** %to **6**;

call missing(of idx1 - idx&comb.);

ncomb = comb(nobs, &comb.);

do i = **1** to ncomb;

rc = lexcombi(nobs, &comb., of idx1 - idx&comb.);

sum = **0**;

do _N_ = **1** to &comb.;

sum + payments(idx(_N_));

end;

if round(sum,**0.01**) > &outstanding. then leave;

if round(sum,**0.01**) = &outstanding. then output;

end;

%end;

**%mend** check;

%** check**;

keep idx1 - idx6;

**run**;

As always, any advice will be gratefully received.

Rob

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

In your new example note that 75.9+106.97=18**2**.87, not 187.87.

For the general case: Have you tried loosening (or even omitting) the stopping rule as I suggested in my previous post?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Oh dear, how embarrassing Can't even try the excuse that it was a slip of the fingers as 2 and 7 are nowhere near each other!

The loosening of the stopping rule seems to have worked. I've created a macro variable of the target amount earlier in the code using sql to calculate the difference between the expected amount and the actual amount and have use the code

if sum > &outstanding. + **9** then leave;

Interestingly enough (for me anyway), if I added a value that was any less than 9 to the macro variable, no results were returned whereas if I use 9 or higher, it correctly identifies the amounts.

Huge thanks for the help people, always impressed by the number of members who are happy to share their expertise and help us less knowledgeable types develop their skills. next step for me is to work my way through the code so that I understand how it's doing what it's doing.

Thanks again,

Rob

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

It looks like a OR problem. Try post it at OR forum and @RobPratt is there .

Are you ready for the spotlight? We're accepting content ideas for **SAS Innovate 2025** to be held May 6-9 in Orlando, FL. The call is **open **until September 25. Read more here about **why** you should contribute and **what is in it** for you!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.