08-23-2016 03:00 PM - edited 08-23-2016 03:02 PM
I am estimating a Tobit model in SAS version 9. About 2/3 of the observations on the dependent variable are equal to zero, and the non-zero observations are heavily right-skewed, with a maximum of 20 million. I came across this SAS note recommending that, when using the "proc qlim" procedure, you rescale variables in order to avoid missing standard errors: http://support.sas.com/kb/31/714.html. Missing standard errors are a real problem for my model, since a number of my independent variables are binary dummies.
However, I noticed that my results are highly sensitive to the way in which I rescale the data. In particular, changes in the magnitude of the rescaling factor for the dependent variable can result in some variables that were previously statistically significant becoming statistically insignificant. Also, rescaling some of the independent variables can change the statistical significance of other independent variables.
Is there any guidance or are there recommended best practices for scaling variables when using non-linear models? I am concerned that the results could be highly dependent on how the variables are scaled, and I would prefer not to have the results rest on arbitrary or ad hoc scaling choices.
Any guidance or recommendations would be greatly appreciated. Thank you for your time.
08-24-2016 07:51 AM
This is more of a gut feeling about what is going on with rescaling for this particular dataset than a firm answer.
With 2/3 of the data below the limit of quantitation (BLQ) and set to zero, the estimates for values "near" the lower limit of quantitation are going to be very dependent on the scaling, hence the shifts in significance. Finding a rule of thumb to work around this may be difficult.
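One way to see why this points to numerics rather than the model itself: under exact maximum likelihood, a Tobit censored at zero is equivariant to rescaling the dependent variable. Multiplying y by a constant just multiplies the coefficient and sigma estimates by the same constant and leaves the t-statistics unchanged, so shifts in significance suggest convergence or Hessian-conditioning trouble rather than a genuine statistical change. A rough sketch of that equivariance in Python rather than SAS (simulated data; all values here are invented for illustration):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y_star = 1.0 + 2.0 * x + rng.normal(scale=2.0, size=n)
y = np.maximum(y_star, 0.0)          # left-censored at zero, as in a standard Tobit

def negloglik(theta, y, x):
    """Negative log-likelihood of a Tobit censored below at 0."""
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)        # log parametrization keeps sigma positive
    mu = b0 + b1 * x
    return -np.where(y <= 0,
                     norm.logcdf(-mu / sigma),                  # censored part
                     norm.logpdf((y - mu) / sigma) - log_sigma  # observed part
                     ).sum()

def num_hess(f, t, rel=1e-4):
    """Central-difference Hessian with steps scaled to parameter size."""
    k = len(t)
    h = rel * np.maximum(1.0, np.abs(t))
    H = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k); ei[i] = h[i]
            ej = np.zeros(k); ej[j] = h[j]
            H[i, j] = (f(t + ei + ej) - f(t + ei - ej)
                       - f(t - ei + ej) + f(t - ei - ej)) / (4 * h[i] * h[j])
    return H

def fit(y, x):
    x0 = [np.mean(y), 0.0, np.log(np.std(y))]   # scale-aware starting values
    res = minimize(negloglik, x0, args=(y, x), method="BFGS")
    cov = np.linalg.inv(num_hess(lambda t: negloglik(t, y, x), res.x))
    return res.x, np.sqrt(np.diag(cov))

theta1, se1 = fit(y, x)              # original scale
theta2, se2 = fit(1e3 * y, x)        # dependent variable rescaled by 1000
t1, t2 = theta1[:2] / se1[:2], theta2[:2] / se2[:2]
print(t1, t2)                        # t-statistics agree up to optimizer tolerance
```

The practical implication is that if significance moves when you rescale, it is worth tightening convergence criteria and checking the Hessian before trusting either set of results.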
My first thought would be to log-transform the original, raw data--before anything is set to zero--and use the log of the lower limit of quantitation as the cutpoint where the Tobit censoring kicks in.
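To make that concrete, here is a small sketch in Python rather than SAS (the LLOQ value and data are made up for illustration): take logs of the raw values first, pin anything below the LLOQ at log(LLOQ) instead of zeroing it, and censor the Tobit at that cutpoint.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
# hypothetical raw assay values: lognormal, so the log is Gaussian
log_y = 0.5 + 1.0 * x + rng.normal(scale=1.5, size=n)
lloq = 1.0                  # hypothetical lower limit of quantitation
c = np.log(lloq)            # the censoring point moves to the log scale
obs = np.maximum(log_y, c)  # BLQ records pinned at log(LLOQ), not set to zero

def negloglik(theta):
    """Tobit on the log scale, censored below at c = log(LLOQ)."""
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)
    mu = b0 + b1 * x
    return -np.where(obs <= c,
                     norm.logcdf((c - mu) / sigma),
                     norm.logpdf((obs - mu) / sigma) - log_sigma).sum()

res = minimize(negloglik, x0=[0.0, 0.0, 0.0], method="BFGS")
b0_hat, b1_hat, sigma_hat = res.x[0], res.x[1], np.exp(res.x[2])
```

Because the log transform compresses a range like 0-to-20-million into single digits, the rescaling question largely goes away, and the censoring point has a natural interpretation as the detection limit rather than an arbitrary zero.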
Can you share your current code?