BookmarkSubscribeRSS Feed
Dawn3
Calcite | Level 5

Hello,
How to find if SUM of any 2 or more values from one dataset matches SUM of any 2 or more values from another dataset?
There are always at least two values (VAR1 and VAR2) in each dataset, and number of VARs are not necessarily the same.

Currently I am using this table to drive the process:

data TEST.COMBINATIONS;

  input VAR1 VAR2 VAR3 VAR4 VAR5 MAX_VAR_NUM CNT;

  datalines;

  1 2 . . . 2 1

  1 3 . . . 3 2

  2 3 . . . 3 3

  1 2 3 . . 3 4

  1 4 . . . 4 5

  2 4 . . . 4 6

  3 4 . . . 4 7

  2 3 4 . . 4 8

  1 2 3 4 . 4 9

  1 5 . . . 5 10

  2 5 . . . 5 11

  3 5 . . . 5 12

  4 5 . . . 5 13

  3 4 5 . . 5 14

  2 3 4 5 . 5 15

  2 4 5 . . 5 16

  1 2 3 4 5 5 17

;

run;

For example, if there are only two values in 1st dataset (MAX_VAR_NUM=2) than only SUM(VAR1, VAR2) is to be used.
If there are 3 values in 2nd dataset (MAX_VAR_NUM=3) than all combinations up to CNT 4 are to be compared with SUM(VAR1, VAR2) from 1st dataset:

SUM(VAR1, VAR2),
SUM(VAR1, VAR3),
SUM(VAR2, VAR3),
SUM(VAR1, VAR2, VAR3).

Is there a better way that hardcoding? The solution above is limited by the number of hardcoded combinations in the COMBINATIONS table.
Since SUM(VAR1, VAR2) = SUM(VAR2, VAR1) function LEXPERM was not helpful for this problem.

 

Thanks a lot!

2 REPLIES 2
Ksharp
Super User

Try function GRAYCODE() .

Can you post these two datasets ?

Dawn3
Calcite | Level 5

Hi Ksharp,

Yes, the GRAYCODE() could be useful for this problem.

Let’s simplify it to one value being matched with two or more:
data TEST.A_TWO;

  input TRANSACTION_ID AMOUNT MULTI_CNT CNT;

  datalines;

  100 2.10 1 1

  100 3.10 2 2

  200 3.00 1 3

  200 2.50 2 4

  200 2.50 3 5

  200 1.00 4 6

 ;

run;


%let val_1=5;               /* value to be matched         */

%let transaction_id=200;    /* within this transaction_id  */

proc sql;

  select count(*) into: MAX_NUM_OF_TRANS

  from TEST.A_TWO

  where TRANSACTION_ID = &transaction_id;

quit;

 

Now we know that there are 4 transactions of transaction_id=200 and we can use MULTI_CNT to identify them. The task remains to produce combinations: 1,2; 2,3;. . .; 1,2,3,4 to finally get combination '2,3' that leads to CNT 4 and 5 that uniquely identify the matches:
%let range=2,3;
proc sql;

  select SUM(AMOUNT) from TEST.A_TWO

  where TRANSACTION_ID = &transaction_id

  and MULTI_CNT IN (&range);

quit;

If there are multiple combinations that produce the same SUM, the first one is sufficient.
Thanks a lot for the ideas.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1215 views
  • 0 likes
  • 2 in conversation