@SteveDenham wrote:
You can set the truncation value at a small non-zero value, and all of the estimates are correctly determined. The issue becomes what is the small value to use. I think a good way to choose would be to see to how many decimal places the response is measured, and then set the truncation at half that value. For example, suppose you measure the response to the nearest thousandth (=Y.YYY). Under this scheme, the truncation value of 0.0005 would guarantee that it is greater than zero, and that all observed values are included.
Or am I still missing the point here?
SteveDenham
I think your suggestion of tentatively selecting several thresholds and seeing what happens is a very good idea. Although the scheme you proposed was built around truncation thresholds, it carries over easily to the selection of censoring thresholds, so I tried your approach on my data.
Before I disclose my findings, let me first reiterate that my original objective was to model the relationship between y and x1, x2, ..., xn. The SEVERITY procedure, however, is versatile and can perform multiple tasks. The more basic one is to estimate the parameters of the distribution(s) that y follows. A more advanced one is to build a regression model for the scale parameter of the distribution of y, e.g., the parameter μ when y follows a lognormal distribution. The latter is done by adding the SCALEMODEL statement to the SEVERITY procedure.
In line with these capabilities, my implementation of your idea also went in two directions: (1) estimate the parameters of y in the absence of predictors, and (2) estimate both the non-scale parameters of y and the regression coefficients of the scale-parameter model. To accomplish both goals, I tried several minuscule but positive thresholds, all of course smaller than the smallest observed positive value in my dataset.
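The runs themselves looked roughly like the sketch below. It assumes the encoding in which a zero response is recoded as missing and given a nonmissing LEFTCENSORED= limit; the macro name, the EPS values, and the variable names are placeholders, not my actual setup.

%macro try_threshold(eps, out);
   /* Recode zero responses as left-censored at the small positive limit &eps */
   data prepared;
      set mydata;
      lc = .;
      if y = 0 then do;
         lc = &eps;   /* left-censoring limit for the zero observations */
         y  = .;      /* response treated as unobserved below the limit */
      end;
   run;

   /* Fit with the left-censoring limit variable and the scale-parameter model */
   proc severity data=prepared outest=&out;
      loss y / leftcensored=lc;
      scalemodel x1 x2 x3;
      dist logn;
   run;
%mend try_threshold;

/* Several minuscule thresholds, all below the smallest positive observed y */
%try_threshold(1e-3, est_a);
%try_threshold(1e-4, est_b);
%try_threshold(1e-5, est_c);

Dropping the SCALEMODEL statement from the macro gives the no-predictor fits of the first objective; the OUTEST= data sets make it straightforward to line up the parameter estimates from the different thresholds side by side.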
However, it was disturbing to find that different thresholds led to different results. For the first objective, PROC SEVERITY still showed some consistency, at least in the estimation of several (but not all!) of the distributions built into the procedure. For the second objective (i.e., in the presence of predictors), the regression coefficient estimates of the scale-parameter model differed from one threshold to another, sometimes quite wildly.
My conclusion is therefore that PROC SEVERITY is not a good tool for dealing with zero-censored data, because the results depend on the specification of the censoring threshold. As a caveat, PROC SEVERITY does offer several advanced features related to parameter estimation, including the specification of starting values for maximum likelihood estimation, the underlying method this procedure uses to accomplish all of the tasks mentioned above. I am not sure whether careful use of these features could remedy the problems described in the preceding paragraph, but I have no interest in trying it out.
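For anyone who does want to try, the relevant pieces are the INIT= option of the DIST statement (starting values) and the NLOPTIONS statement (optimizer settings). A hedged sketch, with purely illustrative values, follows.

proc severity data=prepared;
   loss y / leftcensored=lc;
   scalemodel x1 x2 x3;
   /* INIT= supplies starting values for the distribution parameters */
   dist logn(init=(mu=0.5 sigma=1.0));
   /* NLOPTIONS tunes the nonlinear optimization behind the maximum likelihood fit */
   nloptions maxiter=500 gconv=1e-8;
run;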