I asked a similar question in this thread and was able to solve my problem using a hash find-nearest technique. I had to break my dataset into 10 sets representing 'buckets', which were then compared against a set of limits unique to each bucket. I think the hash technique can do this on one dataset if that 'bucket' information is set up.
A colleague of mine used MATLAB to skin the cat much more easily, using a MATLAB procedure called scatteredInterpolant. You pass (in SAS speak) an observation's various x's ( y = f(x1, x2, x3) ) and it returns a y value interpolated along each of the axes, or x's. In my case, the x's were operating conditions and I wanted to know my performance limit given those.
This is super easy and avoids the conservatism that bucketing the dataset introduces.
Does SAS have anything like this?
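For readers unfamiliar with scattered-data interpolation, one common approach is to triangulate the data points and interpolate linearly inside each triangle. The sketch below shows only that last step for two x variables, using barycentric weights in plain Python. The function names and the sample numbers are made up for illustration, and finding the triangle that contains the query point (the triangulation itself) is left out, so this is not a stand-in for the MATLAB routine.

```python
def barycentric_weights(p, a, b, c):
    """Return the weights (wa, wb, wc) of point p relative to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_in_triangle(p, vertices, values):
    """Linearly interpolate the response at p from the three vertex responses."""
    wa, wb, wc = barycentric_weights(p, *vertices)
    if min(wa, wb, wc) < -1e-12:
        raise ValueError("point lies outside the triangle")
    return wa * values[0] + wb * values[1] + wc * values[2]


# Hypothetical operating conditions (x1, x2) and performance limits at each one.
conditions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
limits = [10.0, 20.0, 30.0]
print(interpolate_in_triangle((0.25, 0.25), conditions, limits))
```

The weights sum to 1 and are all non-negative inside the triangle, so the interpolated limit always lies between the smallest and largest vertex values.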
Accepted Solutions
In general, you should be careful using linear interpolation from the raw response values because the response is often noisy. All parametric and nonparametric regression techniques are essentially ways to fit a smoother through the data and use the smoother to predict ("interpolate") the response at a new point. The regression surface smooths out the noise and gives a better, more robust, fit to the data. It can also handle issues such as repeated values, where you measure the response at the (x1, x2, x3,...) point multiple times and obtain different responses each time.
Without knowing more about your data, my generic suggestion is to fit a quadratic response surface by using PROC RSREG. You can then interpolate from the response surface. PROC RSREG supports arbitrarily many explanatory variables. The Getting Started example shows how to find predictions from the model.
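To make the response-surface idea concrete outside SAS, here is a minimal Python sketch that fits a full second-order model in two explanatory variables by least squares and then predicts at a new point. It is not PROC RSREG and makes no claim about how that procedure works internally. The function names and the synthetic data are assumptions for the example, and the normal equations are solved directly only to keep the sketch dependency-free.

```python
def quadratic_terms(x1, x2):
    """Second-order model terms: intercept, linear, squared, and cross-product."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few distinct design points")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_quadratic_surface(points, responses):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    X = [quadratic_terms(x1, x2) for x1, x2 in points]
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, responses)) for i in range(p)]
    return solve(XtX, Xty)


def predict(coef, x1, x2):
    """Evaluate the fitted surface at a new (x1, x2) point."""
    return sum(c * t for c, t in zip(coef, quadratic_terms(x1, x2)))


# Synthetic 3x3 design; each point is measured twice with different responses,
# which the fit handles by averaging through them.
points = [(a, b) for a in (0.0, 1.0, 2.0) for b in (0.0, 1.0, 2.0)] * 2
truth = [1 + 2 * a - b + 0.5 * a * a + 0.25 * b * b + a * b for a, b in points]
responses = [y + (0.1 if i < 9 else -0.1) for i, y in enumerate(truth)]
coef = fit_quadratic_surface(points, responses)
print(predict(coef, 0.5, 1.5))
```

Because the repeated measurements enter the fit as ordinary rows, nothing special is needed for duplicate x points, which is one of the advantages the reply above describes.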
SAS has methods to interpolate in 3 or more dimensions using model fits. I don't think the specific method used in that MATLAB routine (Delaunay triangulation) is available in SAS, but other methods to do this using model fits include PROC TPSPLINE and PROC ADAPTIVEREG (and possibly others).
Paige Miller
There is a reference to Delauney triangulation in the SAS documentation for the G3GRID procedure, although I have no idea whether that is sufficient for the OP's request:
The estimates of the first, and second derivatives are computed using the n nearest neighbors of the point, where n is the number specified in the GRID statement's NEAR= option. A Delauney triangulation is used for the default method (Ripley 1981, p. 38). The coordinates of the triangles are available in an output data set, if requested by the OUTTRI= option, in the PROC G3GRID statement. This is the default interpolation method.
Interesting, I must have spelled Delauney wrong when I did my search because I didn't find it, but you are correct, it is in PROC G3GRID.
The difference between interpolation methods and the model fitting methods I spoke of is that the model fitting methods will smooth out noise in the data, whereas interpolation does not. The model fitting methods require a Y-variable (response variable) which may not be appropriate for the original user's problem.
Paige Miller
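The distinction between interpolating and fitting can be seen in a tiny one-dimensional Python sketch. Piecewise-linear interpolation passes through every observed value, noise included, while a least-squares line averages across the points. The data values and function names are invented for illustration, and a straight line stands in for whatever model would actually suit the data.

```python
from bisect import bisect_right


def linear_interpolate(xs, ys, x):
    """Piecewise-linear interpolation; xs must be sorted and strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical noisy measurements of a roughly linear response.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.1]

intercept, slope = fit_line(xs, ys)
print("interpolated at x=2:", linear_interpolate(xs, ys, 2.0))  # the raw observation
print("fitted at x=2:", intercept + slope * 2.0)                # the smoothed value
```

At an observed x, the interpolant returns the measurement itself, so any noise in that one reading carries straight into the answer. The fitted line draws on all five readings instead.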
Calling @Rick_SAS