- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello
Let's say someone sent me the regression formula to calculate PD .
I want to check which combination can create PD 0.104
So I need to run the all possible combinations and then check which of them get OD 0.104 ( can be 0.01 higher or lower than it )
Please note:
PD is calculated by :
a=sum of coefficents
PD=Exp(a) / 1+Exp(a)
The answer to my question is that the following combination provide PD 0.104.
My question- How to provide the answer via sas code ?
Data have;
input var $ value coefficient;
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Did you have SAS/OR ? For a large scale problem you have to use PROC OPTMODEL.
Data have;
input var $ value coefficient;
length flag $ 80;
flag=catx(' ',var,value);
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
proc optmodel;
set<str> idx ;
num x{idx};
var v{idx} binary;
read data have into idx=[flag] x=coefficient ;
impvar sum_coeff = sum{i in idx} v[i]*x[i];
con con1: -2.265744524<=sum_coeff<=-2.050518502 ;
/*sum_coeff=log(pd/(1-pd) ==> 0.094->-2.265744524 0.114->-2.050518502 )*/
solve with clp;
/*solve with clp/findallsolns;*/
create data want from [combination id]={s in 1.._NSOL_,i in idx:v[i].sol>0.5} Column1=x[i] flag=v[i].sol[s];
quit;
proc print;
run;
proc sql;
select exp(sum(Column1))/(1+exp(sum(Column1))) as pd from want;
quit;
/*
data _null_;
pd=0.094;
a=log(pd/(1-pd));
put pd= a=;
pd=0.114;
a=log(pd/(1-pd));
put pd= a=;
run;
*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In the output get
What does it mean?
Why there is no information about variables X1,X2,X3,X4,X5,X6 (and have only information about X7,X8,X9)?
I wanted to see all possible combinations with calcultion of PD for each of them and the binary indicatorif PD equal to the required PD in my quaestion.
It is possible that multiple combinations provide same PD
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So you want to pick up one obs from each X1 - X9 ?
a.k.a
one obs from X1
one obs from X2
...
one obs from X9
Here is code you could get ONE solution:
Data have;
input var $ value coefficient;
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
proc optmodel;
set<str,num> idx ;
set row=setof{<i,j> in idx} i;
set col=setof{<i,j> in idx} j;
num x{idx};
var v{idx} binary;
read data have into idx=[var value] x=coefficient ;
impvar sum_coeff = sum{<i,j> in idx} v[i,j]*x[i,j];
con sum: -2.164326874<=sum_coeff<=-2.142863368 ;
/*sum_coeff=log(pd/(1-pd) ==> 0.103 -> -2.164326874 0.105 -> -2.142863368 )*/
con var{i in row}:sum{j in col:<i,j> in idx} v[i,j]=1;
solve with clp;
create data want from [var value]={<i,j> in idx: v[i,j]>0.5} coefficient=x[i,j] ;
quit;
proc print;run;
proc sql;
select exp(sum(coefficient))/(1+exp(sum(coefficient))) as pd from want;
quit;
/*
data _null_;
pd=0.104-0.001;
a=log(pd/(1-pd));
put pd= a=;
pd=0.104+0.001;
a=log(pd/(1-pd));
put pd= a=;
run;
*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you want to find ALL the solution, try the following code.
I found 36 solutions for pd=0.104 +/- 0.001;
Data have;
input var $ value coefficient;
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
proc optmodel;
set<str,num> idx ;
set row=setof{<i,j> in idx} i;
set col=setof{<i,j> in idx} j;
num x{idx};
var v{idx} binary;
read data have into idx=[var value] x=coefficient ;
impvar sum_coeff = sum{<i,j> in idx} v[i,j]*x[i,j];
con sum: -2.164326874<=sum_coeff<=-2.142863368 ;
/*sum_coeff=log(pd/(1-pd) ==> 0.103 -> -2.164326874 0.105 -> -2.142863368 )*/
con var{i in row}:sum{j in col:<i,j> in idx} v[i,j]=1;
/*solve with clp;*/
/*create data want from [var value]={<i,j> in idx: v[i,j]>0.5} coefficient=x[i,j] ;*/
solve with clp/findallsolns;
create data want from [combination var value]={s in 1.._NSOL_,<i,j> in idx: v[i,j].sol[s]>0.5} coefficient=x[i,j] ;
quit;
/*
proc print;run;
proc sql;
select exp(sum(coefficient))/(1+exp(sum(coefficient))) as pd from want;
quit;
*/
proc sort data=want;
by combination var value;
run;
proc sql;
create table want2 as
select combination,exp(sum(coefficient))/(1+exp(sum(coefficient))) as pd from want group by combination;
quit;
/*
data _null_;
pd=0.104-0.001;
a=log(pd/(1-pd));
put pd= a=;
pd=0.104+0.001;
a=log(pd/(1-pd));
put pd= a=;
run;
*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here is solution I found the same as yours.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
While I have solved similar problems, I have never seen a variable called VALUE. What does it mean?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think here VAR represents the name of the scoring variable, and VALUE the value of said scoring variable for a customer. In this way of presenting it, it's just a key-value pair construct.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Since your obs is so small, you could brute force to enumerate all the solutions.
Still get 36 solutions for pd=0.104 +/- 0.001 .
Data have;
input var $ value coefficient;
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
proc sql;
create table want as
select * from
(select value as x1_value,coefficient as x1_coefficient from have where var='X1'),
(select value as x2_value,coefficient as x2_coefficient from have where var='X2'),
(select value as x3_value,coefficient as x3_coefficient from have where var='X3'),
(select value as x4_value,coefficient as x4_coefficient from have where var='X4'),
(select value as x5_value,coefficient as x5_coefficient from have where var='X5'),
(select value as x6_value,coefficient as x6_coefficient from have where var='X6'),
(select value as x7_value,coefficient as x7_coefficient from have where var='X7'),
(select value as x8_value,coefficient as x8_coefficient from have where var='X8'),
(select value as x9_value,coefficient as x9_coefficient from have where var='X9'),
(select value as inter_value,coefficient as inter_coefficient from have where var='INTER')
where
-2.164326874<=
sum(
x1_coefficient,
x2_coefficient,
x3_coefficient,
x4_coefficient,
x5_coefficient,
x6_coefficient,
x7_coefficient,
x8_coefficient,
x9_coefficient,
inter_coefficient
)
<=-2.142863368 ;
quit;