BookmarkSubscribeRSS Feed
ciro
Quartz | Level 8

Dear all, 

I have a dataset like this:

 

data have;
input id target var1 var2 var3 var4 var5 var6;
cards;
1 10000 -2000 10000 3000 7000 12000 30
2 7000 5000 1500 500 6500 750 .
;
run;

 

for each id I want to find all possible combinations of variables (var1-var6) whose sum among them is equal to target. Missing values do not have to enter in any combination.

The result I am looking for is something like:

 

data want;
input id var $ value sol1 sol2 sol3 ;
cards;
1 var1 -2000 0 0 1
1 var2 10000 1 0 0
1 var3 3000 0 1 0
1 var4 7000 0 1 0
1 var5 12000 0 0 1
1 var6 30 0 0 0
2 var1 5000 1 0 .
2 var2 1500 1 0 .
2 var3 500 1 1 .
2 var4 6500 0 1 .
2 var5 750 0 0 .
2 var6 . 0 0 .
;
run;

 

each variable sol (sol1-sol3) is equal to 1 if the corresponding var has to be considered in the sum.

any suggestion is very welcome.

Moreover it is very appreciated any advice on alternative structures of have and/or want data

Thank you in advance.

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hello @ciro,

 

The suggestion below is built around the GRAYCODE function, which generates all subsets of {VAR1, ..., VAR6}. It creates a WANT dataset in a different structure: one observation per solution with binary variables W1, ..., W6 indicating which of the VARi are included in the sum.

data _null_;
if 0 then set have;
array _[*] var:;
call symputx('n',dim(_));
stop;
run;

data want(keep=id target var: w:);
set have;
array v[*] var:;
array w[&n];
k=-1;
do j=1 to 2**&n;
  r=graycode(k, of w[*]);
  e=1; /* eligibility flag to exclude missing values */
  do i=1 to &n while(e);
    if w[i]=1 & v[i]=. then e=0;
  end;
  if e then do;
    s=0;
    do i=1 to &n;
      if w[i]=1 then s+v[i];
    end;
    if s=target then output;
  end;
end;
run;

 

Edit: If your real data contain non-integer  VARi values, make sure to round the sum S appropriately when comparing it to TARGET in order to avoid surprises (due to numeric precision issues) as shown in the log below:

800 data _null_;
801 retain var1-var3 target (33.3 33.4 33.3 100);
802 s=sum(of var:);
803 if s ne target then put 'Surprised?';
804 if round(s, 1e-9)=target then put 'OK!';
805 run;

Surprised?
OK!
Tom
Super User Tom
Super User

That GRAYCODE function is interesting.

Why not use it to generate a score dataset that you could use with PROC SCORE?

%let n=6;
data have;
  input id target var1-var&n ;
cards;
1 10000 -2000 10000 3000 7000 12000 30
2  7000 5000 1500 500 6500 750 .
;

data score;
  _type_='SCORE';
  length _name_ $32 ;
  array var[&n] ;
  nterms=-1;
  do j=1 to 2**&n;
    r=graycode(nterms,of var[*]);
    _name_=cats('X',of var[*]);
    if nterms then output;
  end;
  keep _type_ _name_ var: ;
run;

proc score data=have score=score out=wide;
  var var:;
  id id target;
run;

Then you could use PROC TRANSPOSE to get to something similar to the requested output.

proc transpose data=wide name=inputs out=solutions(where=(col1=target));
  by id target;
  var x: ;
run;

data solutions;
  sol+1;
  set solutions;
  by id target ;
  output;
  if last.target then sol=0;
  drop col1;
run;

Results

Tom_0-1757173246225.png

 

 

FreelanceReinh
Jade | Level 19

@Tom wrote:

That GRAYCODE function is interesting.

Why not use it to generate a score dataset that you could use with PROC SCORE?


That's a great idea which makes the code more efficient by avoiding unnecessary calls of the GRAYCODE function.

 

@ciro: To exclude the trivial solution W1=W2=...=W6=0 in case of TARGET=0 with my DATA step approach (as Tom's code does already), insert a conditional CONTINUE statement as shown below:

...
do j=1 to 2**&n;
  r=graycode(k, of w[*]);
  if r=0 then continue;
  ...
Ksharp
Super User

@FreelanceReinh  has already given you the GRAYCODE() solution.

But I would suggest to use a more sophisticated solution SAS/OR and calling @RobPratt 

 

The following code is for your first record 's solution . and I found 3 solutions:

And I believe @RobPratt  could give you better code , the condition is you have SAS/OR software.

 

data have;
input v $ Column1;
cards;
var1  -2000 
var2 10000 
var3 3000 
var4 7000 
var5 12000 
var6 30
;

proc optmodel;
set idx ;
str var{idx};
num x{idx};
var v{idx} binary;

read data have into  idx=[_n_] var=v x=Column1 ;
con con1:sum{i in idx} v[i]*x[i]=10000 ;
solve with clp/findallsolns;
create data want from [solution index]={s in 1.._NSOL_,i in idx} vname=var[i] value=x[i] flag=v[i].sol[s];
quit;

proc print;run;

Ksharp_0-1757150105766.png

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 225 views
  • 2 likes
  • 4 in conversation