BookmarkSubscribeRSS Feed
ciro
Quartz | Level 8

Dear all, 

I have a dataset like this:

 

data have;
input id target var1 var2 var3 var4 var5 var6;
cards;
1 10000 -2000 10000 3000 7000 12000 30
2 7000 5000 1500 500 6500 750 .
;
run;

 

for each id I want to find all possible combinations of variables (var1-var6) whose sum among them is equal to target. Missing values do not have to enter in any combination.

The result I am looking for is something like:

 

data want;
input id var $ value sol1 sol2 sol3 ;
cards;
1 var1 -2000 0 0 1
1 var2 10000 1 0 0
1 var3 3000 0 1 0
1 var4 7000 0 1 0
1 var5 12000 0 0 1
1 var6 30 0 0 0
2 var1 5000 1 0 .
2 var2 1500 1 0 .
2 var3 500 1 1 .
2 var4 6500 0 1 .
2 var5 750 0 0 .
2 var6 . 0 0 .
;
run;

 

each variable sol (sol1-sol3) is equal to 1 if the corresponding var has to be considered in the sum.

any suggestion is very welcome.

Moreover it is very appreciated any advice on alternative structures of have and/or want data

Thank you in advance.

9 REPLIES 9
FreelanceReinh
Jade | Level 19

Hello @ciro,

 

The suggestion below is built around the GRAYCODE function, which generates all subsets of {VAR1, ..., VAR6}. It creates a WANT dataset in a different structure: one observation per solution with binary variables W1, ..., W6 indicating which of the VARi are included in the sum.

data _null_;
if 0 then set have;
array _[*] var:;
call symputx('n',dim(_));
stop;
run;

data want(keep=id target var: w:);
set have;
array v[*] var:;
array w[&n];
k=-1;
do j=1 to 2**&n;
  r=graycode(k, of w[*]);
  e=1; /* eligibility flag to exclude missing values */
  do i=1 to &n while(e);
    if w[i]=1 & v[i]=. then e=0;
  end;
  if e then do;
    s=0;
    do i=1 to &n;
      if w[i]=1 then s+v[i];
    end;
    if s=target then output;
  end;
end;
run;

 

Edit: If your real data contain non-integer  VARi values, make sure to round the sum S appropriately when comparing it to TARGET in order to avoid surprises (due to numeric precision issues) as shown in the log below:

800 data _null_;
801 retain var1-var3 target (33.3 33.4 33.3 100);
802 s=sum(of var:);
803 if s ne target then put 'Surprised?';
804 if round(s, 1e-9)=target then put 'OK!';
805 run;

Surprised?
OK!
Tom
Super User Tom
Super User

That GRAYCODE function is interesting.

Why not use it to generate a score dataset that you could use with PROC SCORE?

%let n=6;
data have;
  input id target var1-var&n ;
cards;
1 10000 -2000 10000 3000 7000 12000 30
2  7000 5000 1500 500 6500 750 .
;

data score;
  _type_='SCORE';
  length _name_ $32 ;
  array var[&n] ;
  nterms=-1;
  do j=1 to 2**&n;
    r=graycode(nterms,of var[*]);
    _name_=cats('X',of var[*]);
    if nterms then output;
  end;
  keep _type_ _name_ var: ;
run;

proc score data=have score=score out=wide;
  var var:;
  id id target;
run;

Then you could use PROC TRANSPOSE to get to something similar to the requested output.

proc transpose data=wide name=inputs out=solutions(where=(col1=target));
  by id target;
  var x: ;
run;

data solutions;
  sol+1;
  set solutions;
  by id target ;
  output;
  if last.target then sol=0;
  drop col1;
run;

Results

Tom_0-1757173246225.png

 

 

FreelanceReinh
Jade | Level 19

@Tom wrote:

That GRAYCODE function is interesting.

Why not use it to generate a score dataset that you could use with PROC SCORE?


That's a great idea which makes the code more efficient by avoiding unnecessary calls of the GRAYCODE function.

 

@ciro: To exclude the trivial solution W1=W2=...=W6=0 in case of TARGET=0 with my DATA step approach (as Tom's code does already), insert a conditional CONTINUE statement as shown below:

...
do j=1 to 2**&n;
  r=graycode(k, of w[*]);
  if r=0 then continue;
  ...
Ksharp
Super User

@FreelanceReinh  has already given you the GRAYCODE() solution.

But I would suggest to use a more sophisticated solution SAS/OR and calling @RobPratt 

 

The following code is for your first record 's solution . and I found 3 solutions:

And I believe @RobPratt  could give you better code , the condition is you have SAS/OR software.

 

data have;
input v $ Column1;
cards;
var1  -2000 
var2 10000 
var3 3000 
var4 7000 
var5 12000 
var6 30
;

proc optmodel;
set idx ;
str var{idx};
num x{idx};
var v{idx} binary;

read data have into  idx=[_n_] var=v x=Column1 ;
con con1:sum{i in idx} v[i]*x[i]=10000 ;
solve with clp/findallsolns;
create data want from [solution index]={s in 1.._NSOL_,i in idx} vname=var[i] value=x[i] flag=v[i].sol[s];
quit;

proc print;run;

Ksharp_0-1757150105766.png

 

RobPratt
SAS Super FREQ

Extending the CLP approach by @Ksharp, here's a way to do it with a groupBy parameter in the runOptmodel action in SAS Optimization:

%let numVars = 6;

data mycas.have;
   input id target var1-var&numVars;
   cards;
1 10000 -2000 10000 3000 7000 12000 30
2 7000 5000 1500 500 6500 750 .
;

proc cas;
   source pgm;
   num target;
   set VARS init 1..&numVars;
   num a {VARS};
   read data have into target {j in VARS} <a[j]=col('var'||j)>;
   VARS = {j in VARS: a[j] ne .};

   var Select {VARS} binary;
   con SumToTarget:
      sum {j in VARS} a[j] * Select[j] = target;

   solve with clp / findallsolns;

   create data want(drop=j) from [solution j]={s in 1.._NSOL_, j in VARS} var=('var'||j) value=a[j] sol=Select[j].sol[s];
   endsource;
   action optimization.runOptmodel / code=pgm groupBy='id';
quit;

proc print data=mycas.want;
run;

RobPratt_0-1757349492678.png

 

ciro
Quartz | Level 8

Thank you very much everyone for all this ideas and code. Once again, this list proves to be an invaluable tool for support and for learning new things about SAS. 

@RobPratt as I don't have Sas Viya can you suggest the code only with SAS/OR?

Thank you very much in advance

RobPratt
SAS Super FREQ

As requested, here's an alternative approach that uses a COFOR loop in PROC OPTMODEL and works in both SAS/OR (SAS 9) and SAS Optimization (SAS Viya):

proc optmodel printlevel=0;
   set IDS;
   num target {IDS};
   set VARS_ALL = 1..&numVars;
   set VARS {IDS} init VARS_ALL;
   num a {id in IDS, VARS[id]};
   read data have into IDS=[id] target {j in VARS_ALL} <a[id,j]=col('var'||j)>;
   for {id in IDS} VARS[id] = {j in VARS_ALL: a[id,j] ne .};

   num id_this;
   var Select {VARS[id_this]} binary;
   con SumToTarget:
      sum {j in VARS[id_this]} a[id_this,j] * Select[j] = target[id_this];

   set SOLS {IDS};
   num SelectSol {id in IDS, SOLS[id], VARS[id]};
   cofor {id in IDS} do;
      put id=;
      id_this = id;
      solve with clp / findallsolns;
      SOLS[id_this] = 1.._NSOL_;
      for {s in SOLS[id_this], j in VARS[id_this]} SelectSol[id_this,s,j] = Select[j].sol[s];
   end;

   create data want(drop=j) from [id solution j]={id in IDS, s in SOLS[id], j in VARS[id]} var=('var'||j) value=a[id,j] sol=SelectSol;
quit;

 

Ksharp
Super User

@RobPratt 

When I am running your code, and get these error info . 

 

 

 

 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 68         
 69         %let numVars = 6;
 70         
 71         data have;
 72            input id target var1-var&numVars;
 73            cards;
 
 NOTE: 数据集 WORK.HAVE 有 2 个观测和 8 个变量。
 NOTE: “DATA 语句”所用时间(总处理时间):
       实际时间          0.00 秒
       用户 CPU 时间     0.00 秒
       系统 CPU 时间     0.00 秒
       内存              672.21k
       OS 内存           19876.00k
       时间戳           2025-09-10 上午07:23:33
       Step Count                        24  Switch Count  2
       页错误数                       0
       页回收数                     139
       页交换数                        0
       主动上下文切换数        10
       被动上下文切换数      0
       块输入操作数            0
       块输出操作数           264
       
 
 76         ;
 77         
 78         proc optmodel printlevel=0;
 79            set IDS;
 80            num target {IDS};
 81            set VARS_ALL = 1..&numVars;
 82            set VARS {IDS};
 83            num a {id in IDS, VARS[id]};
 84            read data have into IDS=[id] target {j in VARS_ALL} <a[id,j]=col('var'||j)>;
 ERROR: 数组元素“VARS[1]”在第 83 行第 22 列没有值。
 NOTE: 从数据集 WORK.HAVE. 读取了 1 个观测
 85            for {id in IDS} VARS[id] = {j in VARS_ALL: a[id,j] ne .};
 ERROR: 数组元素“VARS[1]”在第 83 行第 22 列没有值。
 86         
 87            num id_this;
 88            var Select {VARS[id_this]} binary;
 89            con SumToTarget:
 90               sum {j in VARS[id_this]} a[id_this,j] * Select[j] = target[id_this];
 91         
 92            set SOLS {IDS};
 93            num SelectSol {id in IDS, SOLS[id], VARS[id]};
 94            cofor {id in IDS} do;
 95               put id=;
 96               id_this = id;
 97               solve with clp / findallsolns;
 98               SOLS[id_this] = 1.._NSOL_;
 99               for {s in SOLS[id_this], j in VARS[id_this]} SelectSol[id_this,s,j] = Select[j].sol[s];
 100           end;
 NOTE: The COFOR statement is executing in single-machine mode.
 id=1
 NOTE: 问题生成将使用 2 个线程。
 NOTE: 先前的错误可能会导致问题无法正确解决。
 ERROR: 数组元素“VARS[1]”在第 88 行第 16 列没有值。
 NOTE: 由于先前的错误,无法创建问题实例。
 101        
 102           create data want(drop=j) from [id solution j]={id in IDS, s in SOLS[id], j in VARS[id]} var=('var'||j) value=a[id,j]
 102      ! sol=SelectSol;
 ERROR: 数组元素“SOLS[1]”在第 102 行第 67 列没有值。
 103        quit;
 NOTE: 由于出错,SAS 系统停止处理该步。
 NOTE: “PROCEDURE OPTMODEL”所用时间(总处理时间):
       实际时间          0.01 秒
       用户 CPU 时间     0.01 秒
       系统 CPU 时间     0.00 秒
       内存              1077.60k
       OS 内存           20648.00k
       时间戳           2025-09-10 上午07:23:33
       Step Count                        25  Switch Count  6
       页错误数                       4
       页回收数                     501
       页交换数                        0
       主动上下文切换数        48
       被动上下文切换数      0
       块输入操作数            928
       块输出操作数           168
       
 104        
 105        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 115        
RobPratt
SAS Super FREQ

@Ksharp Sorry, I had oversimplfiied the code, which I have now corrected.

 

The corrected line is:

set VARS {IDS} init VARS_ALL;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 379 views
  • 8 likes
  • 5 in conversation