Hi,
I would like to adjust the two variables y1 and y2 of dataset have_det so that the total over idgroup is equal to the values in have tot.
I would like to do it with proc optmodel.
the constrains are:
y1sol >=0 ,y2sol >=0
sum(i in idgroup) y1sol=y1tot
sum(i in idgroup) y2sol=y2tot
and the objective is
min sum(i in idgroup) (y1-y1sol)*2
min sum(i in idgroup) (y2-y2sol)*2
basically they are two independent problems: one for y1 and one for y2.
I would appreciate very much for the code.
I would also appreciate any suggestions on the objective function so that the solution is one more similar to the one obtained by scaling the variable(s) to the total(s), that is a raking procedure.
thank you in advance
C
I think the following does what you want, with a separate QP solve for each of the two problems.
proc optmodel;
set PROBS = 1..2;
set IDS;
num y {PROBS, IDS};
str idgroup {IDS};
read data have_det into IDS=[id] {p in PROBS} <y[p,id]=col('y'||p)> idgroup;
set <str> GROUPS;
num ytot {PROBS, GROUPS};
read data have_tot into GROUPS=[g=idgroup] {p in PROBS} <ytot[p,g]=col('y'||p||'tot')>;
num pThis;
/* declare optimization model for one problem at a time */
var yvar {IDS} >= 0;
con SumPerGroup {g in GROUPS}:
sum {i in IDS: idgroup[i] = g} yvar[i] = ytot[pThis,g];
min Objective = sum {i in IDS} (y[pThis,i] - yvar[i])^2;
num ysol {PROBS, IDS};
/* call solver once for each problem */
do pThis = PROBS;
put pThis=;
solve;
for {i in IDS} ysol[pThis,i] = yvar[i];
end;
/* create SAS output data set */
create data want from [id]
{p in PROBS} <col('y'||p)=y[p,id]>
idgroup
{p in PROBS} <col('y'||p||'sol')=ysol[p,id]>;
quit;
I don't have the subject matter expertise to address your question but I can tell from reading what you provide that your question is certainly "underspecified".
If you're asking for support then you need to help your helpers by trying to formulate as clearly as possible what you have and what you need. It's normally also really appreciated if you already share some of your code (whether already fully working or not) as this demonstrates a) the level of your SAS skills so people can gear their answers toward it and b) that you've already spent time solving the problem and are not just fishing for free work from others.
And last but not least: Provide the fully working SAS code that creates the sample "Have" data, provide the desired result and how you want to get there. This again is "helping your helpers" so they can spend the time on the bits you actually need help with.
Sample data does not need to be the real data. Make an example that is small enough that you can calculate the result by hand so that you can demonstrate what you are trying to do.
The want dataset should be similar to the one obtained as follows:
proc sql;
create table want(drop=y1totdet y2totdet) as select a.*,
y1*(y1tot/y1totdet) as y1sol,
y2*(y2tot/y2totdet) as y2sol
from (select *,sum(y1) as y1totdet,
sum(y2) as y2totdet from have_det group by idgroup) as a
left join
have_tot as b
on a.idgroup=b.idgroup
order by id;
quit;
In this case the scaling obtained via a simple sql is enough. in fact the sum over idgroup are equal to the ones in have_tot dataset:
proc means data=want sum;
class idgroup;
var y1sol y2sol;
run;
but in other cases, due to rounding the totals do not match perfectly.
moreover there are cases when, for instance, all y1 in have_det dataset are 0 and cannot be scaled to an y1tot >0.
that's why I would like to solve the problem via an optimization procedure.
third, but only optional, it would be a plus to have the solutions to be integer numbers.
Probably I have to run the scaling first and then the optmodel procedure with the problem defined on the on the scaled variables.
However my main problem here is how to write the optmodel code to work across records within idgroups. That is what I am unable to do.
Hope this is clearer.
In any case, thank you again.
I think the following does what you want, with a separate QP solve for each of the two problems.
proc optmodel;
set PROBS = 1..2;
set IDS;
num y {PROBS, IDS};
str idgroup {IDS};
read data have_det into IDS=[id] {p in PROBS} <y[p,id]=col('y'||p)> idgroup;
set <str> GROUPS;
num ytot {PROBS, GROUPS};
read data have_tot into GROUPS=[g=idgroup] {p in PROBS} <ytot[p,g]=col('y'||p||'tot')>;
num pThis;
/* declare optimization model for one problem at a time */
var yvar {IDS} >= 0;
con SumPerGroup {g in GROUPS}:
sum {i in IDS: idgroup[i] = g} yvar[i] = ytot[pThis,g];
min Objective = sum {i in IDS} (y[pThis,i] - yvar[i])^2;
num ysol {PROBS, IDS};
/* call solver once for each problem */
do pThis = PROBS;
put pThis=;
solve;
for {i in IDS} ysol[pThis,i] = yvar[i];
end;
/* create SAS output data set */
create data want from [id]
{p in PROBS} <col('y'||p)=y[p,id]>
idgroup
{p in PROBS} <col('y'||p||'sol')=ysol[p,id]>;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.