I am trying to use PROC ENTROPY to model a cost function, but the omnibus studentized residual plot shows evidence of heteroskedasticity that I hope to remediate. Ideally, I would follow the usual "rules of the road" and weight using the variance (i.e., squared residuals), but as I am not very familiar with PROC ENTROPY, I don't want to blindly apply something from OLS that would not be appropriate under GME. For example, the documentation for the WEIGHT option in PROC ENTROPY states:
The regressors and the dependent variables are multiplied by the square root of the weight variable to form the weighted matrix and the weighted dependent variable.
This differs from the implementation of the WEIGHT option in PROC GLM:
If the weights for the observations are proportional to the reciprocals of the error variances, then the weighted least squares estimates are best linear unbiased estimators (BLUE).
I do not understand why ENTROPY and GLM appear to implement weighting in two different ways, hence this question.
Thanks for any insights you can offer me regarding this issue.
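The two WEIGHT descriptions quoted above can be compared numerically. The following is a minimal sketch in plain Python, not SAS, and it does not run PROC ENTROPY or GME. It only checks the ordinary least-squares case: multiplying the regressors and the dependent variable by the square root of the weight gives the same estimates as solving the weighted normal equations with weights equal to the reciprocals of the error variances. The data, the assumed error variances, and the helper name fit_two_columns are all hypothetical.

```python
import math

def fit_two_columns(c0, c1, y, w=None):
    """Least squares for y ~ a*c0 + b*c1, minimizing sum_i w_i * resid_i**2.

    Solves the 2x2 (weighted) normal equations directly.
    w=None means every weight is 1 (ordinary least squares).
    """
    if w is None:
        w = [1.0] * len(y)
    s00 = sum(wi * u * u for wi, u in zip(w, c0))
    s01 = sum(wi * u * v for wi, u, v in zip(w, c0, c1))
    s11 = sum(wi * v * v for wi, v in zip(w, c1))
    t0 = sum(wi * u * r for wi, u, r in zip(w, c0, y))
    t1 = sum(wi * v * r for wi, v, r in zip(w, c1, y))
    det = s00 * s11 - s01 * s01
    return (t0 * s11 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det

# Hypothetical data: intercept column, one regressor, assumed error variances.
ones = [1.0] * 5
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.5, 7.4, 11.0]
error_var = [1.0, 1.0, 4.0, 4.0, 9.0]
w = [1.0 / v for v in error_var]  # weights = reciprocals of the variances

# (a) Weighted least squares via the weighted normal equations.
direct = fit_two_columns(ones, x, y, w)

# (b) Scale every column and y by sqrt(w), then fit without weights.
root = [math.sqrt(wi) for wi in w]
scaled = fit_two_columns(
    [r * u for r, u in zip(root, ones)],
    [r * u for r, u in zip(root, x)],
    [r * v for r, v in zip(root, y)],
)

print(direct)
print(scaled)
```

Under these assumptions the two results agree up to floating-point rounding, which suggests the two documentation passages describe the same operation from different angles: one states the mechanics, the other states when the result is BLUE. Whether reciprocal-variance weights are also the best choice under GME is not something this sketch shows.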
The ENTROPY procedure implements a parametric method of linear estimation based on generalized maximum entropy (GME).
The ENTROPY procedure is suitable when there are outliers in the data and robustness is required, when the model is ill-posed or under-determined for the observed data, or for regressions that involve small data sets.
Is your data ill-behaved? Do you have a small sample?
PROC ENTROPY estimates tend to be slightly biased, as they are a type of shrinkage estimate, but they typically have smaller variances than their ordinary least squares (OLS) counterparts, which makes them more desirable from a mean squared error (MSE) viewpoint.
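To make the bias/variance trade-off concrete, here is a toy sketch in plain Python. It is not GME and not what PROC ENTROPY computes: it simply shrinks a sample mean toward zero by a fixed factor c and uses MSE = variance + bias squared. The values of mu, sigma2, n and c, and the function name shrunk_mean_mse, are hypothetical.

```python
def shrunk_mean_mse(mu, sigma2, n, c):
    """Return (bias, variance, mse) of the estimator c * sample_mean.

    The sample mean of n draws has mean mu and variance sigma2 / n,
    so scaling it by c gives bias (c - 1) * mu and variance c**2 * sigma2 / n.
    """
    bias = (c - 1.0) * mu
    variance = c * c * sigma2 / n
    return bias, variance, variance + bias * bias

# Hypothetical setting: true mean 1, error variance 4, only 4 observations.
mu, sigma2, n = 1.0, 4.0, 4

unshrunk = shrunk_mean_mse(mu, sigma2, n, c=1.0)  # unbiased sample mean
shrunk = shrunk_mean_mse(mu, sigma2, n, c=0.8)    # shrunk toward zero

print("unshrunk (bias, variance, mse):", unshrunk)
print("shrunk   (bias, variance, mse):", shrunk)
```

In this small-sample setting the shrunk estimator is biased but has a lower MSE than the unbiased one. With a larger n the variance term shrinks and the advantage can disappear, which is why sample size matters here.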
If you are not dealing with ill-behaved data and/or with a small sample, there are many other ways (besides PROC ENTROPY) to deal with heteroskedasticity in regression residuals.
See here:
Ciao,
Koen