10-23-2015 08:41 AM
I searched around and found primarily the solution for linear interpolation is proc expand procedure. But this doesn't seem available in SAS Base or EG. I am hoping I can receive some helps for linear interpolation in SAS base.
I have the probability and the corresponding values at each point. Probability is from 0 to 1. This set of data ( data_1) is output from an analysis. I have another set of probability data (Data_2) which starting at 0.0001 to 1 with 0.0001 incremental. My goal is to linearly interpolate the corresponding loss for each point in Data_2 based on the information in Data_1.
Hope to hear some solutions from advanced users or experts soon.
10-23-2015 08:43 AM
Sorry that Data_2 numbers are not correct . Here is the revised -
10-23-2015 01:50 PM
Most if not all analysis procedures will allow the presence of missing dependent values in the input data and provide estimated predictions for those. So, the easiest way to get interpolated values is to merge DATA_1 and DATA_2 before doing the analysis and ask for predicted values.
10-23-2015 05:03 PM
Thanks, PG Stats. I appreciate your help. but I am not quite sure about the "ask for predicted values" - are you talking about using regression model to fit? if so, the problem is that the DATA_1 is not a linear model. I only assume it is liear relationship between 2 points of data.
10-23-2015 05:29 PM
I was referring to whatever analysis DATA_1 comes from. But anyway, you can always interpolate by simple calculation:
data test; input probability Value; prabability = 0.01 * probability; datalines; 0.01 100 0.02 110 0.03 115 0.05 200 ; data inter; set test end=done; lastProb = lag(probability); lastValue = lag(value); if _n_ > 1 then do prob = lastprob to probability - 0.0001 by 0.0001; val = ((prob-lastprob)*value + (probability-prob)*lastvalue) / (probability-lastprob); output; end; if done then do; prob = probability; val = value; output; end; keep prob val; run;
10-26-2015 02:30 PM - edited 10-26-2015 02:40 PM
Thanks again, PG Stats. I think your solution is still a bit off what I am trying to do. The other set of data starts from 0.0001 and increases to 1 by 0.0001 incremental. The data set 1, which comes out of a simulation, doesn't necessary has these points and/or the starting point of 0.0001. Looks like I have to "fit" the second set of data in between the upper and lower points in data set 1 and do the linear interpolation. Does this make sense?....
prob = lastprob to probability - 0.0001 by 0.0001
this doesn't necessarily give me the value when prob=0.0001 to 1 by 0.0001 incremental
I feel like I only need a syntax to return the next data point - i.e. the lag() returns the prior values, what is the syntax for the next?
10-26-2015 02:50 PM
My code provides for interpolated values. If 0.0001 is less than your lower probability value then what you are asking for is extrapolation. Linear interpolation between point X1 and point X2 gradually goes from Y1=F(X1) to Y2. But before the first point, there is nowhere to start from.
10-28-2015 01:33 PM
Hi PG Stats I have been trying to use your solution to do the interpolation. I ran the code, but it only returns the last observation... My data has more points like below... and all the way to probability=1, have more than 70,000 obs.
dataset name = old
here is my code - replicating your solution
data new (replace=yes);
set old end=done ;
if _n_>1 then
do prob=lastprob to probability - 0.0025 by 0.0025;
if done then do;
keep prob loss;
The output looks like this
10-28-2015 05:23 PM
Hi PG States
I am looking for the dataset containing... Is that output I obtained consistent with what the code is supposed to return?
0.0025 interpolated value
0.005 interpolated value
0.0075 interpolated value
0.01 interpolated value
1 interpolated value
10-28-2015 05:37 PM - edited 10-28-2015 05:41 PM
OK then, try this:
data have; input Probability Value : comma.; datalines; 0.001000000 (83,012,900) 0.001000012 (83,012,910) 0.001000047 (83,818,600) 0.001000081 (83,720,053) 0.001000115 (83,857,022) 0.001000149 (83,620,737) 0.001000183 (83,160,434) 0.001000217 (81,145,826) 0.001000251 (83,641,370) 0.001000285 (83,674,299) 0.001000320 (82,858,739) 0.001000354 1,038,908,624 0.001000388 (83,061,936) 0.001000422 (81,738,698) 0.001000456 783,656,001 0.001000490 (83,340,855) 0.001000600 (83,340,900) ; %let step=0.0000001; data want; set have end=done; lastProbability = lag(probability); lastValue = lag(value); if _n_ = 1 then do; if mod(probability, &step) = 0 then do; prob = probability; val = value; output; end; end; else do; lastProbCat = int(lastProbability/&step); probCat = int(probability/&step); do p = lastProbCat + 1 to probCat; prob = p*&step; val = ((prob-lastProbability)*value + (probability-prob)*lastvalue) / (probability-lastProbability); output; end; end; keep prob val; run;
Note: I changed the interpolation step to a smaller value because none of your interpolation values occured within your example data.