BookmarkSubscribeRSS Feed
ciro
Quartz | Level 8

Dear all, 

I have a dataset like this:

 

data have;
input id target var1 var2 var3 var4 var5 var6;
cards;
1 10000 -2000 10000 3000 7000 12000 30
2 7000 5000 1500 500 6500 750 .
;
run;

 

for each id I want to find all possible combinations of variables (var1-var6) whose sum among them is equal to target. Missing values do not have to enter in any combination.

The result I am looking for is something like:

 

data want;
input id var $ value sol1 sol2 sol3 ;
cards;
1 var1 -2000 0 0 1
1 var2 10000 1 0 0
1 var3 3000 0 1 0
1 var4 7000 0 1 0
1 var5 12000 0 0 1
1 var6 30 0 0 0
2 var1 5000 1 0 .
2 var2 1500 1 0 .
2 var3 500 1 1 .
2 var4 6500 0 1 .
2 var5 750 0 0 .
2 var6 . 0 0 .
;
run;

 

each variable sol (sol1-sol3) is equal to 1 if the corresponding var has to be considered in the sum.

any suggestion is very welcome.

Moreover it is very appreciated any advice on alternative structures of have and/or want data

Thank you in advance.

9 REPLIES 9
FreelanceReinh
Jade | Level 19

Hello @ciro,

 

The suggestion below is built around the GRAYCODE function, which generates all subsets of {VAR1, ..., VAR6}. It creates a WANT dataset in a different structure: one observation per solution with binary variables W1, ..., W6 indicating which of the VARi are included in the sum.

data _null_;
if 0 then set have;
array _[*] var:;
call symputx('n',dim(_));
stop;
run;

data want(keep=id target var: w:);
set have;
array v[*] var:;
array w[&n];
k=-1;
do j=1 to 2**&n;
  r=graycode(k, of w[*]);
  e=1; /* eligibility flag to exclude missing values */
  do i=1 to &n while(e);
    if w[i]=1 & v[i]=. then e=0;
  end;
  if e then do;
    s=0;
    do i=1 to &n;
      if w[i]=1 then s+v[i];
    end;
    if s=target then output;
  end;
end;
run;

 

Edit: If your real data contain non-integer  VARi values, make sure to round the sum S appropriately when comparing it to TARGET in order to avoid surprises (due to numeric precision issues) as shown in the log below:

800 data _null_;
801 retain var1-var3 target (33.3 33.4 33.3 100);
802 s=sum(of var:);
803 if s ne target then put 'Surprised?';
804 if round(s, 1e-9)=target then put 'OK!';
805 run;

Surprised?
OK!
Tom
Super User Tom
Super User

That GRAYCODE function is interesting.

Why not use it to generate a score dataset that you could use with PROC SCORE?

%let n=6;
data have;
  input id target var1-var&n ;
cards;
1 10000 -2000 10000 3000 7000 12000 30
2  7000 5000 1500 500 6500 750 .
;

data score;
  _type_='SCORE';
  length _name_ $32 ;
  array var[&n] ;
  nterms=-1;
  do j=1 to 2**&n;
    r=graycode(nterms,of var[*]);
    _name_=cats('X',of var[*]);
    if nterms then output;
  end;
  keep _type_ _name_ var: ;
run;

proc score data=have score=score out=wide;
  var var:;
  id id target;
run;

Then you could use PROC TRANSPOSE to get to something similar to the requested output.

proc transpose data=wide name=inputs out=solutions(where=(col1=target));
  by id target;
  var x: ;
run;

data solutions;
  sol+1;
  set solutions;
  by id target ;
  output;
  if last.target then sol=0;
  drop col1;
run;

Results

Tom_0-1757173246225.png

 

 

FreelanceReinh
Jade | Level 19

@Tom wrote:

That GRAYCODE function is interesting.

Why not use it to generate a score dataset that you could use with PROC SCORE?


That's a great idea which makes the code more efficient by avoiding unnecessary calls of the GRAYCODE function.

 

@ciro: To exclude the trivial solution W1=W2=...=W6=0 in case of TARGET=0 with my DATA step approach (as Tom's code does already), insert a conditional CONTINUE statement as shown below:

...
do j=1 to 2**&n;
  r=graycode(k, of w[*]);
  if r=0 then continue;
  ...
Ksharp
Super User

@FreelanceReinh  has already given you the GRAYCODE() solution.

But I would suggest to use a more sophisticated solution SAS/OR and calling @RobPratt 

 

The following code is for your first record 's solution . and I found 3 solutions:

And I believe @RobPratt  could give you better code , the condition is you have SAS/OR software.

 

data have;
input v $ Column1;
cards;
var1  -2000 
var2 10000 
var3 3000 
var4 7000 
var5 12000 
var6 30
;

proc optmodel;
set idx ;
str var{idx};
num x{idx};
var v{idx} binary;

read data have into  idx=[_n_] var=v x=Column1 ;
con con1:sum{i in idx} v[i]*x[i]=10000 ;
solve with clp/findallsolns;
create data want from [solution index]={s in 1.._NSOL_,i in idx} vname=var[i] value=x[i] flag=v[i].sol[s];
quit;

proc print;run;

Ksharp_0-1757150105766.png

 

RobPratt
SAS Super FREQ

Extending the CLP approach by @Ksharp, here's a way to do it with a groupBy parameter in the runOptmodel action in SAS Optimization:

%let numVars = 6;

data mycas.have;
   input id target var1-var&numVars;
   cards;
1 10000 -2000 10000 3000 7000 12000 30
2 7000 5000 1500 500 6500 750 .
;

proc cas;
   source pgm;
   num target;
   set VARS init 1..&numVars;
   num a {VARS};
   read data have into target {j in VARS} <a[j]=col('var'||j)>;
   VARS = {j in VARS: a[j] ne .};

   var Select {VARS} binary;
   con SumToTarget:
      sum {j in VARS} a[j] * Select[j] = target;

   solve with clp / findallsolns;

   create data want(drop=j) from [solution j]={s in 1.._NSOL_, j in VARS} var=('var'||j) value=a[j] sol=Select[j].sol[s];
   endsource;
   action optimization.runOptmodel / code=pgm groupBy='id';
quit;

proc print data=mycas.want;
run;

RobPratt_0-1757349492678.png

 

ciro
Quartz | Level 8

Thank you very much everyone for all this ideas and code. Once again, this list proves to be an invaluable tool for support and for learning new things about SAS. 

@RobPratt as I don't have Sas Viya can you suggest the code only with SAS/OR?

Thank you very much in advance

RobPratt
SAS Super FREQ

As requested, here's an alternative approach that uses a COFOR loop in PROC OPTMODEL and works in both SAS/OR (SAS 9) and SAS Optimization (SAS Viya):

proc optmodel printlevel=0;
   set IDS;
   num target {IDS};
   set VARS_ALL = 1..&numVars;
   set VARS {IDS} init VARS_ALL;
   num a {id in IDS, VARS[id]};
   read data have into IDS=[id] target {j in VARS_ALL} <a[id,j]=col('var'||j)>;
   for {id in IDS} VARS[id] = {j in VARS_ALL: a[id,j] ne .};

   num id_this;
   var Select {VARS[id_this]} binary;
   con SumToTarget:
      sum {j in VARS[id_this]} a[id_this,j] * Select[j] = target[id_this];

   set SOLS {IDS};
   num SelectSol {id in IDS, SOLS[id], VARS[id]};
   cofor {id in IDS} do;
      put id=;
      id_this = id;
      solve with clp / findallsolns;
      SOLS[id_this] = 1.._NSOL_;
      for {s in SOLS[id_this], j in VARS[id_this]} SelectSol[id_this,s,j] = Select[j].sol[s];
   end;

   create data want(drop=j) from [id solution j]={id in IDS, s in SOLS[id], j in VARS[id]} var=('var'||j) value=a[id,j] sol=SelectSol;
quit;

 

Ksharp
Super User

@RobPratt 

When I am running your code, and get these error info . 

 

 

 

 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 68         
 69         %let numVars = 6;
 70         
 71         data have;
 72            input id target var1-var&numVars;
 73            cards;
 
 NOTE: 数据集 WORK.HAVE 有 2 个观测和 8 个变量。
 NOTE: “DATA 语句”所用时间(总处理时间):
       实际时间          0.00 秒
       用户 CPU 时间     0.00 秒
       系统 CPU 时间     0.00 秒
       内存              672.21k
       OS 内存           19876.00k
       时间戳           2025-09-10 上午07:23:33
       Step Count                        24  Switch Count  2
       页错误数                       0
       页回收数                     139
       页交换数                        0
       主动上下文切换数        10
       被动上下文切换数      0
       块输入操作数            0
       块输出操作数           264
       
 
 76         ;
 77         
 78         proc optmodel printlevel=0;
 79            set IDS;
 80            num target {IDS};
 81            set VARS_ALL = 1..&numVars;
 82            set VARS {IDS};
 83            num a {id in IDS, VARS[id]};
 84            read data have into IDS=[id] target {j in VARS_ALL} <a[id,j]=col('var'||j)>;
 ERROR: 数组元素“VARS[1]”在第 83 行第 22 列没有值。
 NOTE: 从数据集 WORK.HAVE. 读取了 1 个观测
 85            for {id in IDS} VARS[id] = {j in VARS_ALL: a[id,j] ne .};
 ERROR: 数组元素“VARS[1]”在第 83 行第 22 列没有值。
 86         
 87            num id_this;
 88            var Select {VARS[id_this]} binary;
 89            con SumToTarget:
 90               sum {j in VARS[id_this]} a[id_this,j] * Select[j] = target[id_this];
 91         
 92            set SOLS {IDS};
 93            num SelectSol {id in IDS, SOLS[id], VARS[id]};
 94            cofor {id in IDS} do;
 95               put id=;
 96               id_this = id;
 97               solve with clp / findallsolns;
 98               SOLS[id_this] = 1.._NSOL_;
 99               for {s in SOLS[id_this], j in VARS[id_this]} SelectSol[id_this,s,j] = Select[j].sol[s];
 100           end;
 NOTE: The COFOR statement is executing in single-machine mode.
 id=1
 NOTE: 问题生成将使用 2 个线程。
 NOTE: 先前的错误可能会导致问题无法正确解决。
 ERROR: 数组元素“VARS[1]”在第 88 行第 16 列没有值。
 NOTE: 由于先前的错误,无法创建问题实例。
 101        
 102           create data want(drop=j) from [id solution j]={id in IDS, s in SOLS[id], j in VARS[id]} var=('var'||j) value=a[id,j]
 102      ! sol=SelectSol;
 ERROR: 数组元素“SOLS[1]”在第 102 行第 67 列没有值。
 103        quit;
 NOTE: 由于出错,SAS 系统停止处理该步。
 NOTE: “PROCEDURE OPTMODEL”所用时间(总处理时间):
       实际时间          0.01 秒
       用户 CPU 时间     0.01 秒
       系统 CPU 时间     0.00 秒
       内存              1077.60k
       OS 内存           20648.00k
       时间戳           2025-09-10 上午07:23:33
       Step Count                        25  Switch Count  6
       页错误数                       4
       页回收数                     501
       页交换数                        0
       主动上下文切换数        48
       被动上下文切换数      0
       块输入操作数            928
       块输出操作数           168
       
 104        
 105        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 115        
RobPratt
SAS Super FREQ

@Ksharp Sorry, I had oversimplfiied the code, which I have now corrected.

 

The corrected line is:

set VARS {IDS} init VARS_ALL;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 729 views
  • 11 likes
  • 5 in conversation