In generalized linear model (GLM) theory, maximum likelihood fitting updates the parameter estimates at each iteration as
Beta(r+1) = Beta(r) - H^(-1) * s
where H = -X'WX is the Hessian, and
s = SUM( w*(y - Mu)*x / (V(Mu)*g'(Mu)*Phi) ) is the gradient vector.
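To make the update concrete, here is a minimal sketch of one iteration in Python/NumPy. It assumes a Poisson model with log link, so V(Mu) = Mu and g'(Mu) = 1/Mu, and it assumes the usual weight matrix W = diag( w / (Phi*V(Mu)*g'(Mu)^2) ), which is not spelled out above. The function name fisher_scoring_step is only illustrative.

```python
import numpy as np

def fisher_scoring_step(beta, X, y, w=None, phi=1.0):
    """One update Beta(r+1) = Beta(r) - H^(-1) * s.

    Assumption: Poisson response with log link, so
    V(Mu) = Mu and g'(Mu) = 1/Mu.
    """
    n = X.shape[0]
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)

    mu = np.exp(X @ beta)      # inverse of the log link
    V = mu                     # Poisson variance function
    g_prime = 1.0 / mu         # derivative of g(Mu) = log(Mu)

    # Gradient: s = SUM( w*(y - Mu)*x / (V(Mu)*g'(Mu)*Phi) )
    s = X.T @ (w * (y - mu) / (V * g_prime * phi))

    # Assumed weights: W = diag( w / (Phi*V(Mu)*g'(Mu)^2) )
    W = w / (phi * V * g_prime**2)

    # Hessian: H = -X'WX
    H = -(X.T * W) @ X

    # Solve H * delta = s instead of forming H^(-1) explicitly
    return beta - np.linalg.solve(H, s)
```

Calling this repeatedly from a starting value such as Beta = 0 until the gradient is close to zero should give the usual ML estimates of Beta for this model. Only Beta is updated here; Phi is held fixed.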
My question is: how do these formulas change for a parameter that is not part of Beta? Examples are the dispersion parameter in the negative binomial model or the scale parameter in the normal model. I would like to estimate such a parameter as an additional entry in the Hessian and gradient vector.
I would appreciate any reference to some literature or help with the formulas. Thanks!