
IML / REGRESSION / FORMULA

Occasional Contributor
Posts: 16

Despite having sorted out my problem with the attached IML code, I am looking for an alternative way to get the following done:

 

I have a fixed-length vector Y (60 evenly spaced months) that can take any shape when plotted against month in a series plot. The values are the gap between the outstanding debt on a typical loan and the residual value of the object that serves as collateral for the loan (e.g., a car). Because the residual value can be subject to commercial decisions at discrete time points (12, 24 months, ...) and the down payment can differ, the shape of the gap vector can vary a lot.

 

Now I want to find an ever-increasing function f(x) (mathematically, I suppose, f'(x) >= 0 and f''(x) >= 0) that approximates the Y values from below and never exceeds them. Furthermore, the function has to return 0 until the gap vector becomes positive.
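
One possible way to write this requirement down for the 60 monthly values is as a small linear program. This is only a sketch under assumptions that are not in the post: the symbols c_t (accumulated value in month t), s_t (monthly step) and y_t^+ (the gap floored at zero) are introduced here for illustration, and maximizing the sum of the c_t is just one reasonable reading of "approximates from below".

```latex
\begin{aligned}
\max_{c_1,\dots,c_T}\quad & \sum_{t=1}^{T} c_t \\
\text{subject to}\quad
 & c_t \le y_t^{+} = \max(y_t, 0), && t = 1,\dots,T, \\
 & s_t = c_t - c_{t-1} \ge 0,      && t = 1,\dots,T, \quad c_0 = 0, \\
 & s_{t+1} \ge s_t,                && t = 1,\dots,T-1 .
\end{aligned}
```

Here T = 60. The second line is the discrete analogue of f'(x) >= 0 and the third of f''(x) >= 0. Flooring the gap at zero forces c_t = 0 for as long as the gap has not yet become positive.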

 

Before solving it in IML I thought about doing it with some kind of regression. But the regressions I know cut through the middle of the cloud of observed values and thus violate the restriction, so I dropped this idea. Furthermore, I'd need a regression whose solution is an exponential formula, or something like a polynomial of exp functions, or line segments that get steeper and steeper...

 

 

Is there any PROC that can do what I am looking for? Is there an easy mathematical formulation for this?

 

Thanks a lot. Here's the valid solution for me that I obtained with the attached code in IML.  

[Image: function.png]

 

The graph belongs to a real world vector of gaps and represents the solution I'm fine with.  

The code below uses a test vector but the logic applied is the same.

I go through the months, and the GAP vector is updated after each assignment of available increments. The available increment is the maximum value I can apply from the current month through the last month without violating the restriction that the accumulated steps never exceed the dynamic gaps. By doing so while moving forward through the months, I ensure that the resulting "function" (the total of the increments for each month) is ever increasing.

 

 

proc iml;


GAP={60 140 240 280 350 480};

print gap;

/* I use arithmetic series and sums to force the objective function to be strictly increasing in cases
where it would otherwise have equal increments because of the shape of the gap function. Deactivated here for simplicity (reparto=0). */
n=ncol(gap);
Reihe=3;
reparto=0.0;
cake=gap[n]*reparto;

tester=1:n;

tester2=shape(tester,reihe);

do i=1 to nrow(tester2);
tester2[i,]=i/(reihe*(reihe+1)/2)/ncol(tester2);
end;

testcol=colvec(tester2);
prime=testcol*cake;
gap=gap-cusum(t(prime));

testsum=tester2[+,]*REPARTO;
testsum=testsum[,+];


print tester tester2 testsum testcol prime gap;






start Rel_dist(x, nn);

idx0=loc(x<=0);

pos=(ncol(idx0)=0);
maxdiv=j(1,nn,1);

print pos maxdiv;

if pos then do;
gap_dyn=x;
maxdiv[1:nn]=1:nn;
gap_dyn_dist=gap_dyn/maxdiv;
end;

if ^pos then do;
be=max(idx0);
gap_hembra=j(1,be,0);
gap_dyn=gap_hembra||t(x[(be+1):nn]);
maxdiv[(be+1):nn]=1:(nn-be);
gap_dyn_dist=gap_dyn/maxdiv;
end;

return(gap_dyn_dist);

finish;

TEMP=J(n,n,0);


gap_dyn=rel_dist(gap, n);

I_DYN=1;

do u=1 to n while(i_dyn=1);
idx=loc(gap_dyn>0);
gap_rel=gap_dyn[idx];
min_dyn=gap_rel[>:<,];
dim_gap=nrow(gap_rel);

if min_dyn=dim_gap then do;
temp[u,(n-dim_gap+1):n]=gap_rel[min_dyn];
i_dyn=0;
end;

if min_dyn ^= dim_gap then do;
temp[u,(n-dim_gap+1):n]=gap_rel[min_dyn];
gap=gap-cusum(temp[u,]);
gap_dyn=rel_dist(gap, n);
end;


end;

testsum2=temp[+,];
testsum2=testsum2[,+];


print gap gap_dyn idx gap_rel min_dyn temp testsum2;

ALL=TEMP//T(PRIME);

ALL=ALL[+,];
PRINT ALL;

CTR=CUSUM(ALL);

PRINT CTR;
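
The procedure described in the post can also be restated more compactly. The following is a minimal sketch, not the original program: it assumes a strictly positive gap vector, leaves out the "reparto" adjustment, and the names step, level, start and last are made up for illustration. For each segment it takes the largest constant increment that no later gap forbids, applies it up to the month where that bound is tight, and then starts the next segment from there.

```sas
proc iml;
/* Test vector from the post; assumed strictly positive */
gap = {60 140 240 280 350 480};
n = ncol(gap);

step  = j(1, n, 0);   /* monthly increments (hypothetical name) */
level = 0;            /* accumulated total at the end of the previous segment */
start = 1;            /* first month of the current segment */

do while (start <= n);
   k     = 1:(n - start + 1);               /* months elapsed within the segment */
   slope = (gap[, start:n] - level) / k;    /* constant increment that would just reach each later gap */
   m     = min(slope);                      /* largest increment that violates no later gap */
   last  = start + max(loc(slope = m)) - 1; /* last month at which that bound is tight */
   step[, start:last] = max(m, 0);          /* never hand out a negative increment */
   level = level + max(m, 0) * (last - start + 1);
   start = last + 1;                        /* next segment starts after the binding month */
end;

ctr = cusum(step);   /* accumulated steps */
print step, ctr, gap;
quit;
```

Working the test vector by hand, the segments come out as 60 for month 1, 72.5 for months 2 to 5, and 130 for month 6, so the accumulated steps touch the gap at months 1, 5 and 6 and stay below it elsewhere.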

Accepted Solutions
Solution
‎04-11-2017 11:00 AM
SAS Super FREQ
Posts: 3,225

Re: IML / REGRESSION / FORMULA

Is there a reason why you require f'' > 0? That is not required for a general nondecreasing function. For example, f(x)=log(x) is increasing for all x > 0, but f''(x) < 0 there.

 

Like KSharp, I don't understand what you are trying to achieve. Your program has no comments and I don't know what you are using for data.  I think KSharp's comment was meant to tell you that you can use the EFFECT statement to construct a truncated-power basis of spline functions. From the graph, you might want to use 

1. A basis function that is zero before month 18 and linear after. 

2. A basis function that is zero before month 36 (or 48?) and linear after

A regression on those basis functions will result in a piecewise linear regression, but I don't think you can guarantee that the fit "approximates the Y values from below and never exceeds them."
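
To make the two basis functions concrete, here is a rough sketch. It builds the hinge functions in a DATA step instead of with the EFFECT statement, and it runs on synthetic data generated only for illustration (the data set Have, the variables b18 and b36, and the coefficients 40 and 60 are all assumptions, not the poster's series).

```sas
/* Synthetic illustration data, not the poster's gap series */
data Have;
   call streaminit(1);
   do month = 1 to 60;
      gap = 40*max(month - 18, 0) + 60*max(month - 36, 0) + rand("normal", 0, 50);
      b18 = max(month - 18, 0);   /* zero before month 18, linear after */
      b36 = max(month - 36, 0);   /* zero before month 36, linear after */
      output;
   end;
run;

proc reg data=Have plots=none;
   model gap = b18 b36 / noint;   /* NOINT forces the fit to be 0 before month 18 */
   output out=Fit predicted=pred;
run;
quit;

/* Count the months in which the OLS fit lies above the data */
proc sql;
   select sum(pred > gap) as n_above from Fit;
quit;
```

Because ordinary least squares balances the residuals around the fit, n_above will generally not be zero, which is the "from below" problem mentioned above.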

 

In PROC REG you can use the RESTRICT statement to restrict the coefficient of a linear parametric model to be positive. You will obtain the best OLS regression subject to that restriction. You can also choose to include a quadratic term and restrict the sign of the quadratic coefficient.  Again, that model will not guarantee the "from below" condition.
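
A sketch of a sign-restricted fit follows. As far as I know, the RESTRICT statement in PROC REG is written in terms of linear equations, so to be on the safe side this example swaps in PROC NLIN, whose BOUNDS statement accepts inequality bounds on parameters. The data are synthetic and the parameter names b0, b1 and b2 are made up for illustration.

```sas
/* Synthetic convex series, for illustration only */
data Have;
   call streaminit(1);
   do month = 1 to 60;
      gap = 5*month + 0.8*month**2 + rand("normal", 0, 100);
      output;
   end;
run;

proc nlin data=Have;
   parms b0=0 b1=1 b2=1;            /* starting values */
   bounds b1 >= 0, b2 >= 0;         /* nonnegative slope and curvature */
   model gap = b0 + b1*month + b2*month**2;
run;
```

With b1 >= 0 and b2 >= 0 the fitted quadratic is nondecreasing and convex for month >= 0, but it is still a least-squares fit and can lie above some observations.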

 

My last suggestion is that if you are modeling the cumulative counts of something, perhaps you should look into one of the procedures for survival analysis.

 



All Replies
Grand Advisor
Posts: 9,320

Re: IML / REGRESSION / FORMULA

I don't completely understand your question.
You might take a look at this blog post:

http://blogs.sas.com/content/iml/2017/04/05/nonsmooth-models-spline-effects.html


Occasional Contributor
Posts: 16

Re: IML / REGRESSION / FORMULA

Dear experts,

 

I'll take note and get into the habit of commenting my code; I'm the one who will benefit most from it...

What I call the gap function, the blue series in the plot, is generated somewhat artificially with commercial intentions. It's a real business application. By playing with the parameters we ensure a positive gap towards (at least) the last months of the credit payment schedule. The customer can redeem this gap amount by purchasing a new article and returning the old one, which is still being financed, early.

To prevent the customer from receiving a one-off impact like "now, at month 49 of your financing, you've earned a bonus of 3,000 dollars which you can activate by returning the old one and financing a new one", we create ever-increasing steps.

Depending on the shape of the gap function we will grant him, e.g., 10 dollars monthly in the first 36 months, 15 dollars from month 37 to 43, 17 dollars from... and so forth.

 

My code does exactly this. It calculates the "spline"-like segments that over the 60 months accumulate to the artificial gap amount we factored in. Therefore the accumulated steps and the gap series meet at month 60. The steps per month, the accumulated steps and the gap are expressed relative to the gap at month 60, and therefore they are shown in %.

 

That being said, it should become clear that the accumulated steps have to be ever increasing and have to stay below the gap function so that we don't get exposed to losses (which would occur if the bonus given to the customer exceeded the amount we expect to earn from the return and resale of the product).
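
These two requirements are easy to check for any candidate schedule. The snippet below is only a sketch: the step values are hypothetical numbers chosen to fit the small test vector from the original post, and the variable names and the tolerance eps are made up for illustration.

```sas
proc iml;
gap  = {60 140 240 280 350 480};        /* test vector from the original post */
step = {60 72.5 72.5 72.5 72.5 130};    /* hypothetical monthly steps to be checked */
n    = ncol(gap);
ctr  = cusum(step);                     /* accumulated steps */

eps = 1e-8;                             /* tolerance for floating-point comparisons */
neverAbove    = all(ctr <= gap + eps);  /* accumulated bonus never exceeds the gap */
nonnegative   = all(step >= 0);         /* no negative monthly step */
nondecreasing = all(step[, 2:n] >= step[, 1:(n-1)] - eps);  /* steps never shrink */

pct = ctr / gap[n];                     /* accumulated steps relative to the final gap */
print neverAbove nonnegative nondecreasing, pct[format=percent8.1];
quit;
```

Each flag is 1 when its condition holds for every month; pct is the accumulated bonus expressed as a share of the gap in the last month.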

 

Rick, your blog posts are my library and they inspire me. But I have to warn you that I'll ask more silly questions before I gain some expertise :)

 

SAS Super FREQ
Posts: 3,225

Re: IML / REGRESSION / FORMULA

A colleague told me that the TRANSREG procedure supports the MSPLINE transformation, which finds a monotonically increasing transformation t(x) of X. That transformation is used to get the predicted values, which will be monotonically increasing or decreasing depending on the relationship between y and t(x). Typically you would want to specify some knots with MSPLINE, but it is not required. There is no way to force a horizontal line if the function is decreasing; all it enforces is monotonicity. Maybe the following example will be helpful:

 

data a;
call streaminit(12345);
do x = 1 to 10 by 0.1;
   y = x/3 + sin(x) + rand("normal",0,0.5);
   output;
end;
run;

proc sgplot data=a;
scatter x=x y=y;
run;

proc transreg data=a plots=(Fit Transformation);
model identity(y) = mspline(x);
run;