Hello
I would like to ask a question please.
I have a data set with 3 fields: ID,Mis,Use
I need to calculate a new field called Mis_New by following metric:
Mis_New= min(Use*Factor,Mis);
The question is how to find the factor that will reduce SUM of MIS by 200 .
For example:
If I use Factor=1.2 then I will reduce SUM of MIS from 5780 to 5330 (difference is 450...but required difference is 200)
Any idea?
Thank you
Data tttbl1;
input ID Mis Use;
cards;
1 100 90
2 150 130
3 800 300
4 250 200
5 1100 1000
6 950 900
7 890 800
8 250 240
9 590 530
10 700 600
;
Run;
Data tttbl2;
SET tttbl1;
Mis_New= min((Use*1.2),Mis);
Run;
PROC SQL;
create table summary1 as
select sum(Mis) as Sum_Mis ,
sum(Mis_New) as Sum_Mis_New,
calculated Sum_Mis-calculated Sum_Mis_New as dif
from tttbl2
;
QUIT;
I think you are looking for something like this:
data want;
if _n_ = 1 then do until(eof);
set tttbl1 end = eof;
sum + Mis;
end;
fac = (sum - 200) / sum;
do until(last);
set tttbl1 end = last;
new = min(Use, ceil(Mis * fac));
output;
end;
drop fac sum;
run;
Thank you
I run your code but I don't find the answer.....what is the factor value that reduce by 200?
Is there a more clever way to do it?
I found that factor=2 is the solution.
Is there a better way to run the simulation?
Data tttbl1;
input ID Mis Use;
cards;
1 100 90
2 150 130
3 800 300
4 250 200
5 1100 1000
6 950 900
7 890 800
8 250 240
9 590 530
10 700 600
;
Run;
%macro mmacro(factor,run);
Data tttbl2_&run.;
SET tttbl1;
Mis_New= min((Use*&factor.),Mis);
Run;
PROC SQL;
create table SummaryData&run. as
select &factor. as factor,
&run. as run,
sum(Mis) as Sum_Mis ,
sum(Mis_New) as Sum_Mis_New,
calculated Sum_Mis-calculated Sum_Mis_New as Reduced_amount
from tttbl2_&run.
;
QUIT;
%mend;
%mmacro(1.05,1);
%mmacro(1.1,2);
%mmacro(1.15,3);
%mmacro(1.2,4);
%mmacro(1.25,5);
%mmacro(1.3,6);
%mmacro(1.4,7);
%mmacro(1.45,8);
%mmacro(1.5,9);
%mmacro(1.55,10);
%mmacro(1.6,11);
%mmacro(1.65,12);
%mmacro(1.7,13);
%mmacro(1.75,14);
%mmacro(1.8,15);
%mmacro(1.85,16);
%mmacro(1.9,17);
%mmacro(1.95,18);
%mmacro(2.0,19);
%mmacro(2.05,20);
%mmacro(2.1,21);
Data Simulations;
Set SummaryData:;
By factor;
Run;
Hello,
Edit Removed my second answer as i am not sure it really made sense.
Something like this ?
%macro mmacro(factor,run);
data tttbl2_&run.;
set tttbl1;
Mis_New= min((Use*&factor.),Mis);
sum_Mis+Mis;
sum_Mis_New+Mis_New;
Reduced_amount=sum_Mis-sum_Mis_New;
call symputx("Reduced_amount",Reduced_amount,"G");
run;
%mend;
%let MAXRUN=100;
%let TARGET=200;
%let PREC=1.0e-06;
Data tttbl1;
input ID Mis Use;
cards;
1 100 90
2 150 130
3 800 300
4 250 200
5 1100 1000
6 950 900
7 890 800
8 250 240
9 590 530
10 700 600
;
Run;
data history;
stop=0;
minfactor=1; maxfactor=10;
do run=1 by 1 while(not stop and run<&MAXRUN.);
if run=&MAXRUN. then put "WARNING: maximum number of runs reached";
factor=(minfactor+maxfactor)/2;
rc=dosubl(cats('%mmacro(',factor,',',run,');'));
Reduced_amount=symget("Reduced_amount");
put Reduced_amount=;
if abs(Reduced_amount-&TARGET.)<&PREC. then stop=1;
else if Reduced_amount<&TARGET. then do;
action="The reduced amount is too small => we decrease the factor";
maxfactor_old=maxfactor;
maxfactor=factor;
end;
else do;
action="The reduced amount is too big => we increase the factor";
minfactor_old=minfactor;
minfactor=factor;
end;
output;
end;
run;
Sorry,
I don't see where you end the macro that you defined and don't see where you run it.
The metric to calculate the value of new "Mis" field for each customer is
New_Mis=min((Use*Factor),Mis);
Where did you use it in the code please?
The macro only contains one data step that compute the reduced amount for a given factor :
%macro mmacro(factor,run);
data tttbl2_&run.;
set tttbl1;
Mis_New= min((Use*&factor.),Mis);
sum_Mis+Mis;
sum_Mis_New+Mis_New;
Reduced_amount=sum_Mis-sum_Mis_New;
call symputx("Reduced_amount",Reduced_amount,"G");
run;
%mend;
You can see that it contains your formula
Mis_New= min((Use*&factor.),Mis);
This macro is then used in the data history step, inside the do loop, for each tried value of the factor :
rc=dosubl(cats('%mmacro(',factor,',',run,');'));
dosubl is used since the reduced amount is computed in a separate data step.
The issue is like that.
Customers in bank has :
CREDIT CARD LINE (This is the maximum amount that they are allowed to use).
CREDIT USE (This is the actual amount that customer use in credit card).
As you understand CREDIT USE should be most of times lower than CREDIT CARD LINE.
Let's say that in the bank there are 1 million customers and total CREDIT CARD LINE is 10 billions Dollar.
Now let's say that there is requirement to reduce Total CREDIT CARD LINE amount by 2 billions Dollar.
Let's say that the metric that was agreed is that for each customer in the bank the NEW CREDIT CARD LINE will be calculated by:
NEW CREDIT CARD LINE=min((CREDIT USE*Factor),Current_CREDIT_CARD_LINE);
The target is to calculate the Factor that give us the desired reduce of 2 billions Dollar.
Moreover, better to have a chart that show us for each value of factor what will be the reduced amount
Hello @Ronein,
Now that you've got an algorithm approximating the factor numerically, let me add an approach aiming at a direct calculation (good point, @Reeza!):
%let threshold=200;
proc sql;
create view vtemp as
select mis, use, round(mis/use,1e-12) as r
from tttbl1
order by r;
create table totals as
select sum(mis) as s, sum(use) as t
from tttbl1;
quit;
data want(keep=factor);
if _n_=1 then set totals;
do until(last.r);
set vtemp;
by r;
if first.r & s-r*t<=&threshold then do;
factor=(s-&threshold)/t;
output;
stop;
end;
s+(-mis);
t+(-use);
end;
run;
Note: Modifications would be necessary if use=0 was a possible value in dataset TTTBL1.
@gamotte wrote:
I can only give you one like unfortunately.
Thanks! 🙂 Honestly, I'm not quite satisfied with my solution. I mentioned "direct calculation," but this was actually an overstatement: In fact, the algorithm steps through the vertices of the graph of a convex, decreasing, piecewise linear function (from left to right) until the graph crosses the threshold. Only then a direct calculation is performed (intersection of the respective line segment with the horizontal threshold line). I haven't tested the program yet on a large dataset where your nice approximation might be more efficient because it doesn't care about the line segments, but just the function values. Also, as mentioned, the current version of my program wouldn't work if use=0 for some observation.
I don't know about efficiency but it surely seems clever to me the way you recompute the reduced amount
variable at each step without having to re-read the whole dataset.
The metric to calculate the value of new "Mis" field for each customer is
New_Mis=min((Use*Factor),Mis);
Where can I see this formula in your code please?
The factor should be apply on "Use" field and not on "Mis" field
In the code that you sent I think that by mistake factor is applied on "Mis" field
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.