Hello,
How to find if SUM of any 2 or more values from one dataset matches SUM of any 2 or more values from another dataset?
There are always at least two values (VAR1 and VAR2) in each dataset, and number of VARs are not necessarily the same.
Currently I am using this table to drive the process:
data TEST.COMBINATIONS;
input VAR1 VAR2 VAR3 VAR4 VAR5 MAX_VAR_NUM CNT;
datalines;
1 2 . . . 2 1
1 3 . . . 3 2
2 3 . . . 3 3
1 2 3 . . 3 4
1 4 . . . 4 5
2 4 . . . 4 6
3 4 . . . 4 7
2 3 4 . . 4 8
1 2 3 4 . 4 9
1 5 . . . 5 10
2 5 . . . 5 11
3 5 . . . 5 12
4 5 . . . 5 13
3 4 5 . . 5 14
2 3 4 5 . 5 15
2 4 5 . . 5 16
1 2 3 4 5 5 17
;
run;
For example, if there are only two values in 1st dataset (MAX_VAR_NUM=2) than only SUM(VAR1, VAR2) is to be used.
If there are 3 values in 2nd dataset (MAX_VAR_NUM=3) than all combinations up to CNT 4 are to be compared with SUM(VAR1, VAR2) from 1st dataset:
SUM(VAR1, VAR2),
SUM(VAR1, VAR3),
SUM(VAR2, VAR3),
SUM(VAR1, VAR2, VAR3).
Is there a better way that hardcoding? The solution above is limited by the number of hardcoded combinations in the COMBINATIONS table.
Since SUM(VAR1, VAR2) = SUM(VAR2, VAR1) function LEXPERM was not helpful for this problem.
Thanks a lot!
Try function GRAYCODE() .
Can you post these two datasets ?
Hi Ksharp,
Yes, the GRAYCODE() could be useful for this problem.
Let’s simplify it to one value being matched with two or more:
data TEST.A_TWO;
input TRANSACTION_ID AMOUNT MULTI_CNT CNT;
datalines;
100 2.10 1 1
100 3.10 2 2
200 3.00 1 3
200 2.50 2 4
200 2.50 3 5
200 1.00 4 6
;
run;
%let val_1=5; /* value to be matched */
%let transaction_id=200; /* within this transaction_id */
proc sql;
select count(*) into: MAX_NUM_OF_TRANS
from TEST.A_TWO
where TRANSACTION_ID = &transaction_id;
quit;
Now we know that there are 4 transactions of transaction_id=200 and we can use MULTI_CNT to identify them. The task remains to produce combinations: 1,2; 2,3;. . .; 1,2,3,4 to finally get combination '2,3' that leads to CNT 4 and 5 that uniquely identify the matches:
%let range=2,3;
proc sql;
select SUM(AMOUNT) from TEST.A_TWO
where TRANSACTION_ID = &transaction_id
and MULTI_CNT IN (&range);
quit;
If there are multiple combinations that produce the same SUM, the first one is sufficient.
Thanks a lot for the ideas.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.